Initialization dependence of clustering algorithms

Wim De Mulder; Stefan Schliebs; René Boel; Martin Kuiper

Conference Proceedings


Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5507 LNCS, Part 2 (2009), pp. 615-622

DOI: 10.1007/978-3-642-03040-6_75

4 Citations

2 Readers


Abstract

It is well known that the clusters produced by a clustering algorithm depend on the chosen initial centers. In this paper we present a measure for the degree to which a given clustering algorithm depends on the choice of initial centers, for a given data set. This measure is calculated for four well-known offline clustering algorithms (k-means Forgy, k-means Hartigan, k-means Lloyd and fuzzy c-means), for five benchmark data sets. The measure is also calculated for ECM, an online algorithm that does not require the number of initial centers as input, but for which the resulting clusters can depend on the order in which the input arrives. Our main finding is that this initialization dependence measure can also be used to determine the optimal number of clusters. © 2009 Springer Berlin Heidelberg.
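The abstract does not define the measure itself, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a stand-in score: run Lloyd-style k-means from several random initializations and average the pairwise disagreement (1 minus the Rand index) between the resulting partitions. The function names `lloyd_kmeans`, `rand_index` and `init_dependence`, the toy data, and the choice of the Rand index are all assumptions made for illustration.

```python
import random
from itertools import combinations


def lloyd_kmeans(points, k, rng, iters=50):
    """Plain Lloyd-style k-means; initial centers are k distinct points drawn by rng."""
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center (ties go to the lower index).
        new_labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def rand_index(a, b):
    """Fraction of point pairs on which two partitions agree (same/different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def init_dependence(points, k, runs=10, seed=0):
    """Hypothetical score: mean pairwise (1 - Rand index) over runs with different initial centers."""
    rng = random.Random(seed)
    results = [lloyd_kmeans(points, k, rng) for _ in range(runs)]
    scores = [1 - rand_index(x, y) for x, y in combinations(results, 2)]
    return sum(scores) / len(scores)


# Toy data (an assumption, not one of the paper's benchmark sets): two small square blobs.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]

if __name__ == "__main__":
    for k in range(1, 5):
        print(k, init_dependence(points, k))
```

A score of 0 means every run produced the same partition; larger values mean the result varies more with the initial centers. Looping over `k` mirrors the idea of comparing the dependence across cluster counts, although the abstract does not state how the paper selects the optimal number from its measure.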

Citation (APA)

De Mulder, W., Schliebs, S., Boel, R., & Kuiper, M. (2009). Initialization dependence of clustering algorithms. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5507 LNCS, pp. 615–622). https://doi.org/10.1007/978-3-642-03040-6_75
