A comparison of categorical attribute data clustering methods

Ville Hautamäki; Antti Pöllänen; Tomi Kinnunen; Kong Aik Lee; Haizhou Li; Pasi Fränti

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014) 8621 LNCS 53-62

DOI: 10.1007/978-3-662-44415-3_6

Abstract

Clustering data in Euclidean space has a long tradition, and considerable attention has been paid to analyzing several different cost functions. Unfortunately, these results rarely generalize to clustering of categorical attribute data. Instead, a simple heuristic, k-modes, is the most commonly used method despite its modest performance. In this study, we model clusters by their empirical distributions and use expected entropy as the objective function. A novel clustering algorithm based on local search is designed for this objective function and compared against six existing algorithms on well-known data sets. The proposed method provides better clustering quality than the other iterative methods at the cost of higher time complexity. © 2014 Springer-Verlag Berlin Heidelberg.
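
The objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation or the search moves, so the following assumes that a cluster's entropy is the sum of the per-attribute Shannon entropies of its empirical distributions (attributes treated as independent), that the expected entropy is the size-weighted average over clusters, and that local search reassigns one point at a time. The function names `cluster_entropy`, `expected_entropy`, and `local_search` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def cluster_entropy(rows):
    """Sum of per-attribute Shannon entropies (in bits) of the empirical
    distributions in one cluster. Assumes attributes are independent."""
    n = len(rows)
    if n == 0:
        return 0.0
    total = 0.0
    for column in zip(*rows):  # iterate over attributes
        for count in Counter(column).values():
            p = count / n
            total -= p * math.log2(p)
    return total


def expected_entropy(data, labels, k):
    """Size-weighted average of the cluster entropies (assumed objective)."""
    n = len(data)
    cost = 0.0
    for j in range(k):
        members = [row for row, lab in zip(data, labels) if lab == j]
        cost += len(members) / n * cluster_entropy(members)
    return cost


def local_search(data, labels, k, max_passes=20):
    """Greedy single-point reassignment: an illustrative move rule,
    not necessarily the one used in the paper."""
    labels = list(labels)
    best = expected_entropy(data, labels, k)
    for _ in range(max_passes):
        improved = False
        for i in range(len(data)):
            current = labels[i]
            for j in range(k):
                if j == current:
                    continue
                labels[i] = j  # tentatively move point i to cluster j
                cost = expected_entropy(data, labels, k)
                if cost < best - 1e-12:
                    best, current, improved = cost, j, True
            labels[i] = current  # keep the best assignment found
        if not improved:
            break  # local optimum reached
    return labels, best
```

For the four rows `("a", "x")`, `("a", "x")`, `("b", "y")`, `("b", "y")` with initial labels `[0, 1, 0, 1]` and `k = 2`, this sketch starts at an objective of 2 bits and ends with the identical rows grouped together at 0 bits. Recomputing the full objective for every trial move is deliberately naive; it keeps the example short and mirrors the higher time complexity the abstract mentions.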

Citation (APA)

Hautamäki, V., Pöllänen, A., Kinnunen, T., Lee, K. A., Li, H., & Fränti, P. (2014). A comparison of categorical attribute data clustering methods. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8621 LNCS, pp. 53–62). Springer Verlag. https://doi.org/10.1007/978-3-662-44415-3_6
