Clustering categorical data is an important and challenging data analysis task. In this paper, we explore the use of kernel K-means to cluster categorical data. We propose a new kernel function, based on the Hamming distance, that embeds categorical data in a constructed feature space in which the clustering is carried out. We experimentally evaluate the quality of the solutions produced by kernel K-means on real datasets. The results indicate that kernel K-means with the proposed kernel function can discover clusters embedded in categorical data. © Springer-Verlag Berlin Heidelberg 2005.
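The general idea can be illustrated with a minimal sketch. The abstract does not give the exact form of the proposed kernel, so the code below assumes an exponential kernel on the Hamming distance, k(x, y) = exp(-gamma * d_H(x, y)), which is positive semidefinite but is not necessarily the kernel defined in the paper. The function names, the `gamma` parameter, the toy records, and the initial labels are all hypothetical.

```python
import math

def hamming(x, y):
    """Number of attributes on which two categorical records differ."""
    return sum(a != b for a, b in zip(x, y))

def hamming_kernel(x, y, gamma=0.5):
    """Assumed kernel: exp(-gamma * Hamming distance). Not taken from the paper."""
    return math.exp(-gamma * hamming(x, y))

def kernel_kmeans(data, k, init_labels, gamma=0.5, max_iter=100):
    """Kernel K-means on a precomputed Gram matrix.

    The squared feature-space distance from point i to the mean of cluster C is
    K[i][i] - (2/|C|) * sum_j K[i][j] + (1/|C|^2) * sum_{j,l} K[j][l], with j, l in C.
    """
    n = len(data)
    K = [[hamming_kernel(data[i], data[j], gamma) for j in range(n)]
         for i in range(n)]
    labels = list(init_labels)
    for _ in range(max_iter):
        clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Within-cluster term, computed once per cluster per iteration.
        within = [sum(K[j][l] for j in m for l in m) / len(m) ** 2 if m else 0.0
                  for m in clusters]
        new_labels = []
        for i in range(n):
            best, best_d = labels[i], float("inf")
            for c, members in enumerate(clusters):
                if not members:
                    continue  # skip empty clusters
                cross = sum(K[i][j] for j in members) / len(members)
                d = K[i][i] - 2.0 * cross + within[c]
                if d < best_d:
                    best, best_d = c, d
            new_labels.append(best)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels

# Toy records (color, size, shape); record 3 starts in the wrong cluster.
data = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("red", "medium", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "flat"),
    ("green", "large", "square"),
]
labels = kernel_kmeans(data, k=2, init_labels=[0, 0, 0, 0, 1, 1], gamma=0.5)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Like ordinary K-means, this procedure only reaches a local optimum, so the outcome depends on the initial assignment and on `gamma`; in practice one would typically try several initializations.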
Couto, J. (2005). Kernel K-means for categorical data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3646 LNCS, pp. 46–56). Springer Verlag. https://doi.org/10.1007/11552253_5