Size-constrained clustering using an initial points selection method

Kai Lei; Sibo Wang; Weiwei Song; Qilin Li

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013) 8041 LNAI 195-205

DOI: 10.1007/978-3-642-39787-5_16

Abstract

Size-constrained clustering addresses the problem of partitioning a dataset into groups based on the similarity between documents, with the additional requirement that the size of each group falls within a fixed range. To date, the main approach has been to add constraints to the assignment step of K-Means clustering. As with standard K-Means, however, the performance of such an algorithm depends heavily on the initial cluster centers. We propose an initial points selection method that recursively discovers a point with large density around it. Root Mean Square Error and convergence speed (number of iterations) are the two most important evaluation criteria for clustering with an iterative procedure. Our experiments are conducted on about ten thousand research proposals from the National Natural Science Foundation of China, and the results show that our method reduces the number of iterations by over 50% and achieves a smaller Root Mean Square Error. The method is scalable and can be coupled with a scalable size-constrained clustering algorithm to address large-scale clustering problems in data mining. © 2013 Springer-Verlag Berlin Heidelberg.
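
The abstract does not give the density definition or the exact recursion, so the following is only a rough sketch of the general idea, not the authors' algorithm. It assumes density means the number of not-yet-covered points within a fixed `radius`, and that each chosen point's neighbourhood is discarded before the next seed is picked. The names `density_seeds`, `rmse`, and `radius` are hypothetical.

```python
import math


def density_seeds(points, k, radius):
    """Pick up to k initial centers by repeatedly taking the remaining point
    with the most remaining neighbours within `radius` (an assumed density
    measure), then discarding that point's neighbourhood."""
    remaining = list(range(len(points)))
    seeds = []
    while remaining and len(seeds) < k:
        # Density of point i = count of remaining points within radius of it.
        best = max(
            remaining,
            key=lambda i: sum(
                1 for j in remaining
                if math.dist(points[i], points[j]) <= radius
            ),
        )
        seeds.append(points[best])
        # Drop the chosen point and everything close to it.
        remaining = [
            j for j in remaining
            if math.dist(points[best], points[j]) > radius
        ]
    return seeds


def rmse(points, centers):
    """Root mean square distance from each point to its nearest center."""
    total = sum(min(math.dist(p, c) for c in centers) ** 2 for p in points)
    return math.sqrt(total / len(points))
```

For the toy input `[(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 20)]` with `k=2` and `radius=2`, the sketch picks one seed from each dense group and ignores the isolated point. The seeds could then initialise any K-Means variant, including one whose assignment step enforces size bounds.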

Cite (APA)

Lei, K., Wang, S., Song, W., & Li, Q. (2013). Size-constrained clustering using an initial points selection method. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8041 LNAI, pp. 195–205). Springer Verlag. https://doi.org/10.1007/978-3-642-39787-5_16