Categorical data: an approach to visualization for cluster analysis

Olga Mishulina; Maria Eidlina

Conference Proceedings

Studies in Computational Intelligence, vol. 799 (2019), pp. 202-209

DOI: 10.1007/978-3-030-01328-8_23

Abstract

This paper considers the problem of studying the cluster structure of a set of objects described by qualitative (categorical) features. We propose an approach to visualizing the source data and groups of categorical data in a form that is convenient for human analysis and decision-making. The approach generalizes Andrews’ idea of numeric data visualization to categorical data sets. It can be applied when the frequency distribution of the joint occurrence of feature pairs in the data sample is known. For visualization, we propose to use not the primary features of the data set but new paired features that have a strong statistical relationship. In addition, we adjust the spectral representation of Andrews curves by limiting the maximum frequency of the harmonic functions. The proposed visual representation of categorical data makes it possible to estimate the number of clusters in a data set and to show how the clusters differ. The technique is demonstrated on a model example in which the number of clusters is decided in conjunction with two other ways of visualizing data clusters: a silhouette plot and a heat map.
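
For orientation, the classical Andrews curve for a numeric vector x = (x1, x2, x3, ...) is f(t) = x1/sqrt(2) + x2 sin t + x3 cos t + x4 sin 2t + x5 cos 2t + ..., evaluated for t in [-pi, pi]. The sketch below implements only this classical numeric form. The abstract does not give the formulas of the categorical generalization, so none are reproduced here. The names `andrews_curve`, `sample_curve` and the `max_harmonic` parameter are hypothetical, and simply dropping terms above a chosen harmonic is an assumption about what limiting the maximum frequency could look like, not the authors' procedure.

```python
import math


def andrews_curve(x, t, max_harmonic=None):
    """Evaluate the classical Andrews function of vector x at angle t.

    max_harmonic is a hypothetical parameter: terms whose harmonic
    number exceeds it are dropped (an assumed form of frequency limiting).
    """
    value = x[0] / math.sqrt(2.0)
    for i, coeff in enumerate(x[1:]):
        k = i // 2 + 1  # harmonic number: 1, 1, 2, 2, 3, 3, ...
        if max_harmonic is not None and k > max_harmonic:
            break
        # Even positions carry the sine term, odd positions the cosine term.
        basis = math.sin(k * t) if i % 2 == 0 else math.cos(k * t)
        value += coeff * basis
    return value


def sample_curve(x, n_points=200, max_harmonic=None):
    """Sample the curve on an evenly spaced grid over [-pi, pi]."""
    step = 2.0 * math.pi / (n_points - 1)
    ts = [-math.pi + j * step for j in range(n_points)]
    return ts, [andrews_curve(x, t, max_harmonic) for t in ts]
```

Each object yields one curve, and objects from the same cluster tend to produce curves that run close together, which is what makes the plot usable for judging the number of clusters by eye.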

Mishulina, O., & Eidlina, M. (2019). Categorical data: an approach to visualization for cluster analysis. In Studies in Computational Intelligence (Vol. 799, pp. 202–209). Springer Verlag. https://doi.org/10.1007/978-3-030-01328-8_23
