Clustering mixed type attributes in large dataset

Jian Yin; Zhifang Tan

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2005), Vol. 3758 LNCS, pp. 655-661

DOI: 10.1007/11576235_66

Abstract

Clustering is a widely used technique in data mining. Many clustering algorithms exist, but most are either limited to a single attribute type or can handle both data types but are inefficient on large datasets. Few algorithms do both well. In this paper, we propose a clustering algorithm, CFIKP, that can handle large datasets with mixed-type attributes. We first use a CF*-tree to pre-cluster the dataset. Once the dense regions are stored in the leaf nodes, we treat each dense region as a single point and use an improved k-prototype algorithm to cluster these dense regions. Experiments show that the CFIKP algorithm is very efficient in clustering large datasets with mixed-type attributes. © Springer-Verlag Berlin Heidelberg 2005.
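
As an illustration of how mixed-type records can be compared in k-prototype-style clustering, the sketch below uses the standard k-prototypes dissimilarity: squared Euclidean distance on the numeric attributes plus a weighted count of categorical mismatches. It is not the improved variant used in CFIKP, whose details the abstract does not give, and it does not model the CF*-tree pre-clustering step. The function names and the `gamma` weight are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# A prototype is a (numeric_values, categorical_values) pair (assumed layout).
Prototype = Tuple[Sequence[float], Sequence[str]]


def mixed_dissimilarity(
    x_num: Sequence[float],
    x_cat: Sequence[str],
    proto_num: Sequence[float],
    proto_cat: Sequence[str],
    gamma: float = 1.0,
) -> float:
    """Standard k-prototypes dissimilarity (not the paper's improved version).

    Numeric part: squared Euclidean distance.
    Categorical part: number of mismatching attributes, weighted by gamma.
    """
    numeric_part = sum((a - b) ** 2 for a, b in zip(x_num, proto_num))
    categorical_part = sum(1 for a, b in zip(x_cat, proto_cat) if a != b)
    return numeric_part + gamma * categorical_part


def nearest_prototype(
    x_num: Sequence[float],
    x_cat: Sequence[str],
    prototypes: List[Prototype],
    gamma: float = 1.0,
) -> int:
    """Return the index of the prototype with the smallest dissimilarity."""
    distances = [
        mixed_dissimilarity(x_num, x_cat, p_num, p_cat, gamma)
        for p_num, p_cat in prototypes
    ]
    return min(range(len(distances)), key=distances.__getitem__)
```

In a two-stage scheme like the one the abstract describes, the points passed to such a function would be summaries of dense regions rather than raw records; how those summaries are built is left open here.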
