A Comparative Performance Analysis of Fast K-Means Clustering Algorithms


Abstract

Data clustering is a fundamental and widespread problem in computer science that has attracted considerable interest in both scientific communities and application domains. Among the different algorithmic methods, the k-means algorithm and its prominent implementation, the Lloyd algorithm, have developed into a de facto standard for partitioning-based clustering. This algorithm, however, turns out to be inefficient on very large databases. To mitigate this efficiency issue, several fast k-means algorithms for ad-hoc and exact data clustering have been proposed in the literature. Because their inner workings and applied pruning criteria differ, it is difficult to predict the efficiency of individual algorithms in a given application scenario. We thus present a performance analysis of existing fast k-means algorithms. We focus on simple interpretability and comparability and abstract from many implementation details so as to provide a guide for data scientists and practitioners alike.
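
For orientation, the Lloyd iteration that the abstract treats as the baseline can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `lloyd_kmeans`, its parameters, and the distance counter are assumptions introduced here. The counter makes the cost visible: every iteration compares each of the n points with each of the k centers, which is the work that fast k-means variants try to avoid through pruning.

```python
import math


def lloyd_kmeans(points, centers, max_iter=100):
    """Plain Lloyd iteration (illustrative sketch, hypothetical helper).

    Returns (centers, labels, distance_evaluations).
    """
    centers = [tuple(c) for c in centers]
    labels = [None] * len(points)
    evaluations = 0
    for _ in range(max_iter):
        # Assignment step: compare every point with every center (n * k distances).
        new_labels = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            evaluations += len(centers)
            new_labels.append(dists.index(min(dists)))
        if new_labels == labels:
            break  # assignments are stable, so the centers would not move either
        labels = new_labels
        # Update step: move each center to the mean of its assigned points.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, labels, evaluations


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(lloyd_kmeans(data, centers=[(0, 0), (10, 10)]))
```

The initial centers are passed in explicitly so that the sketch stays deterministic; practical implementations usually choose them by a seeding strategy.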

Cite

Beecks, C., Berns, F., Hüwel, J. D., Linxen, A., Schlake, G. S., & Düsterhus, T. (2022). A Comparative Performance Analysis of Fast K-Means Clustering Algorithms. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13635 LNCS, pp. 119–125). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-21047-1_11