July, 2023

W. J. F. Silva, P. J. C. Souza, R. M. C. R. Souza and F. J. A. Cysneiros, “A Clustering Algorithm for Polygonal Data Applied to Scientific Journal Profiles,” in IEEE Transactions on Pattern Analysis and Machine Intelligence, doi: 10.1109/TPAMI.2023.3297022.


Millions of papers are submitted and published every year, but researchers often have little information about the journals that interest them. In this paper, we introduce the first dynamic clustering algorithm for symbolic polygonal data and apply it to build scientific journal profiles. Dynamic clustering algorithms are a family of iterative two-step relocation algorithms. At each iteration they construct the clusters and identify a suitable representation or prototype (means, axes, probability laws, groups of elements, etc.) for each cluster by locally optimizing an adequacy criterion that measures the fit between the clusters and their corresponding prototypes. Symbolic polygonal data can summarize extensive datasets while taking their variability into account. The application gives a clear view of the main variables that describe journals. In addition, we develop cluster and partition interpretation indices for polygonal data that extract insights from the clustering results. From these indices, we discovered, e.g., that the number of difficult words in the abstract is fundamental to building journal profiles.
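
The two-step relocation scheme described in the abstract can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' algorithm: it assumes every polygon has the same number of vertices listed in the same order, uses the sum of squared distances between corresponding vertices as the adequacy criterion, and uses the vertex-wise mean polygon as the prototype. The names `sq_distance`, `mean_polygon`, and `dynamic_clustering` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # assumption: all polygons share the same vertex count


def sq_distance(p: Polygon, q: Polygon) -> float:
    """Sum of squared Euclidean distances between corresponding vertices."""
    return sum((px - qx) ** 2 + (py - qy) ** 2 for (px, py), (qx, qy) in zip(p, q))


def mean_polygon(polygons: List[Polygon]) -> Polygon:
    """Vertex-wise mean polygon, the minimizer of the criterion above."""
    n = len(polygons)
    return [
        (sum(v[0] for v in verts) / n, sum(v[1] for v in verts) / n)
        for verts in zip(*polygons)
    ]


def dynamic_clustering(
    data: List[Polygon], prototypes: List[Polygon], max_iter: int = 100
) -> Tuple[List[int], List[Polygon]]:
    """Alternate allocation and representation steps until labels stop changing."""
    prototypes = list(prototypes)
    labels = [-1] * len(data)
    for _ in range(max_iter):
        # Allocation step: assign each polygon to its closest prototype.
        new_labels = [
            min(range(len(prototypes)), key=lambda k: sq_distance(x, prototypes[k]))
            for x in data
        ]
        if new_labels == labels:
            break  # the partition is stable, so the criterion cannot improve
        labels = new_labels
        # Representation step: recompute the prototype of each non-empty cluster.
        for k in range(len(prototypes)):
            members = [x for x, lab in zip(data, labels) if lab == k]
            if members:
                prototypes[k] = mean_polygon(members)
    return labels, prototypes
```

Calling `dynamic_clustering(data, initial_prototypes)` with a list of polygons and one starting prototype per cluster returns the cluster label of each polygon and the final prototypes. Each step can only lower or preserve the total within-cluster criterion, which is why the loop stops once the labels no longer change.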


Wagner J. F. Silva, Centro de Informatica, Universidade Federal de Pernambuco, Brazil

Pedro J. C. Souza, Centro de Informatica, Universidade Federal de Pernambuco, Brazil

Renata M. C. R. Souza, Centro de Informatica, Universidade Federal de Pernambuco, Brazil

Francisco Jose A. Cysneiros, Departamento de Estatistica, Universidade Federal de Pernambuco, Brazil
