A practical comparison of two K-Means clustering algorithms

Abstract

Background: Data clustering is a powerful technique for identifying data with similar characteristics, such as genes with similar expression patterns. However, not all implementations of clustering algorithms yield the same performance or the same clusters.

Results: In this paper, we study two implementations of a general method for data clustering: k-means clustering. Our experiments compare the running times and distance efficiency of Lloyd's K-means Clustering and the Progressive Greedy K-means Clustering.

Conclusion: Based on our implementation, Lloyd's K-means Clustering algorithm is more efficient, not only in processing time but also in terms of mean squared-difference (MSD). This analysis was performed on both a gene expression level sample and randomly generated datasets in three-dimensional space. However, other circumstances may dictate a different choice.

© 2008 Wilkin and Huang; licensee BioMed Central Ltd.
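For readers unfamiliar with the first of the two methods, the following is a minimal sketch of Lloyd's k-means iteration in plain Python. It is not the authors' implementation. The function names (`lloyd_kmeans`, `mean_squared_difference`), the random initialization from the input points, and the reading of MSD as the mean squared Euclidean distance from each point to its assigned centroid are all assumptions made for illustration. The Progressive Greedy variant is not sketched because the abstract does not describe it.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]


def lloyd_kmeans(points, k, max_iter=100, seed=0):
    """Lloyd's iteration: alternate assignment and centroid update.

    Assumption: initial centroids are k distinct input points chosen at random.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New centroid is the coordinate-wise mean of the cluster.
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # Converged: the update left every centroid unchanged.
        centroids = new_centroids
    # Recompute labels so they always match the returned centroids.
    return centroids, assign(points, centroids)


def mean_squared_difference(points, centroids, labels):
    """Assumed MSD: mean squared distance from each point to its centroid."""
    total = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return total / len(points)


if __name__ == "__main__":
    # Two well-separated groups of points in three-dimensional space.
    data = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (10, 10, 10), (10, 11, 10), (11, 10, 10)]
    centers, labels = lloyd_kmeans(data, k=2)
    print(labels, mean_squared_difference(data, centers, labels))
```

Under this reading, a lower MSD means tighter clusters, so a comparison like the one in the abstract would weigh running time against the MSD each algorithm reaches on the same data.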

Wilkin, G. A., & Huang, X. (2008). A practical comparison of two K-Means clustering algorithms. BMC Bioinformatics, 9(SUPPL. 6). https://doi.org/10.1186/1471-2105-9-S6-S19
