Parallel K-means clustering algorithm on DNA dataset

Abstract

Clustering is the division of data into groups of similar objects. K-means is widely used in clustering work because of its simplicity. Our main effort is to parallelize the K-means clustering algorithm. The parallel version exploits the inherent parallelism in the Distance Calculation and Centroid Update phases. The parallel K-means algorithm is designed so that each of the P participating nodes is responsible for handling n/P data points. We ran the program on a Linux cluster with a maximum of eight nodes using the message-passing programming model. We evaluated performance in terms of the percentage of correct answers and the speed-up achieved. The results show that our parallel K-means program performs relatively well on large datasets.
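
The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`local_step`, `parallel_kmeans`), the contiguous chunking, the Euclidean distance, and the handling of empty clusters are all assumptions made for illustration. The P nodes are simulated by an ordinary loop in a single process. On a real cluster each chunk would sit on its own node, and the final summation would be a collective reduction in a message-passing library, which the abstract does not name.

```python
from math import dist


def local_step(chunk, centroids):
    """Work done by one (simulated) node on its share of the points."""
    k, d = len(centroids), len(centroids[0])
    sums = [[0.0] * d for _ in range(k)]   # per-cluster coordinate sums
    counts = [0] * k                       # per-cluster point counts
    for point in chunk:
        # Distance Calculation phase: find the nearest centroid.
        j = min(range(k), key=lambda c: dist(point, centroids[c]))
        counts[j] += 1
        for a in range(d):
            sums[j][a] += point[a]
    return sums, counts


def parallel_kmeans(points, centroids, num_nodes, iterations=10):
    """Run K-means with the points split across num_nodes simulated nodes."""
    n = len(points)
    # Give each node a contiguous chunk of roughly n / P points.
    bounds = [n * r // num_nodes for r in range(num_nodes + 1)]
    chunks = [points[bounds[r]:bounds[r + 1]] for r in range(num_nodes)]
    centroids = [list(c) for c in centroids]
    k, d = len(centroids), len(centroids[0])
    for _ in range(iterations):
        # Each node works independently on its own chunk.
        partials = [local_step(chunk, centroids) for chunk in chunks]
        # Centroid Update phase: combine the partial sums from all nodes
        # (a stand-in for a collective reduction over the network).
        for j in range(k):
            total = sum(counts[j] for _, counts in partials)
            if total == 0:
                continue  # assumption: an empty cluster keeps its centroid
            centroids[j] = [
                sum(sums[j][a] for sums, _ in partials) / total
                for a in range(d)
            ]
    return centroids


points = [(0, 0), (0, 2), (10, 10), (10, 12)]
print(parallel_kmeans(points, [(0, 0), (10, 10)], num_nodes=2))
```

In this sketch only the k partial sums and counts cross node boundaries on each iteration, never the points themselves, which is why per-iteration communication depends on k and the dimensionality rather than on n.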

Cite

Othman, F., Abdullah, R., Rashid, N. A., & Salam, R. A. (2004). Parallel K-means clustering algorithm on DNA dataset. In Lecture Notes in Computer Science (Vol. 3320, pp. 248–251). Springer Verlag. https://doi.org/10.1007/978-3-540-30501-9_54
