A Comparative Performance Analysis of Fast K-Means Clustering Algorithms

Christian Beecks; Fabian Berns; Jan David Hüwel; Andrea Linxen; Georg Stefan Schlake; Tim Düsterhus

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2022), Vol. 13635 LNCS, pp. 119–125

DOI: 10.1007/978-3-031-21047-1_11

Abstract

Data clustering is a fundamental and widespread problem in computer science that has attracted considerable interest in both scientific communities and application domains. Among the different algorithmic methods, the k-means algorithm and its prominent implementation, the Lloyd algorithm, have developed into a de facto standard for partitioning-based clustering. This algorithm, however, turns out to be inefficient on very large databases. To mitigate this efficiency issue, several fast k-means algorithms for ad-hoc and exact data clustering have been proposed in the literature. Since their inner workings and applied pruning criteria differ, it is difficult to predict the efficiency of individual algorithms in certain application scenarios. We thus present a performance analysis of existing fast k-means algorithms. We focus on simple interpretability and comparability and abstract from many implementation details so as to provide a guide for data scientists and practitioners alike.

Citation (APA)

Beecks, C., Berns, F., Hüwel, J. D., Linxen, A., Schlake, G. S., & Düsterhus, T. (2022). A Comparative Performance Analysis of Fast K-Means Clustering Algorithms. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13635 LNCS, pp. 119–125). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-21047-1_11
