Data clustering: Integrating different distance measures with modified k-means algorithm

Abstract

Unsupervised learning partitions a given data set into a number of clusters such that similar data objects belong to the same cluster and dissimilar data objects belong to different clusters. k-Means is a partition-based unsupervised learning algorithm that is popular for its simplicity and ease of use. However, k-Means has a major shortcoming: the number of clusters and the centroids must be supplied in advance. Decimal scaling is a normalization approach that standardizes the features of a dataset and improves the effectiveness of the algorithm. Integrating different distance measures with a modified k-Means algorithm helps in selecting the proper distance measure for a specific data mining application. This paper compares the results of modified k-Means (Mk-Means) with different distance measures, namely Euclidean distance, Manhattan distance, Minkowski distance, and cosine distance, combined with the decimal scaling normalization approach. The analysis is carried out on various datasets from the UCI Machine Learning Repository and shows that Mk-Means is advantageous and that its effectiveness improves with the normalized approach and the Minkowski distance measure. © 2012 Springer India Pvt. Ltd.
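
As an illustration only, the distance measures and the decimal scaling normalization named in the abstract can be sketched in Python using their standard textbook definitions. The function names, the per-column treatment of features, and the choice of a non-negative scaling exponent are assumptions made for this sketch. The abstract does not describe the modification made to k-Means, so that part is not reproduced here.

```python
import math


def decimal_scale(column):
    """Decimal scaling: divide every value by 10**j, where j is the smallest
    non-negative integer such that the largest absolute value falls below 1.
    (Restricting j to non-negative values is an assumption of this sketch.)"""
    max_abs = max(abs(v) for v in column)
    j = 0
    while max_abs / (10 ** j) >= 1:
        j += 1
    return [v / (10 ** j) for v in column]


def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def manhattan(a, b):
    """Manhattan distance: the Minkowski distance with p = 1."""
    return minkowski(a, b, 1)


def euclidean(a, b):
    """Euclidean distance: the Minkowski distance with p = 2."""
    return minkowski(a, b, 2)


def cosine_distance(a, b):
    """Cosine distance, taken here as 1 minus the cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)
```

In a comparison of this kind, any of these functions could serve as the point-to-centroid distance in the assignment step of a k-Means style loop, with `decimal_scale` applied to each feature column beforehand. Which order p the paper used for the Minkowski distance is not stated in the abstract.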

Cite

Patel, V. R., & Mehta, R. G. (2012). Data clustering: Integrating different distance measures with modified k-means algorithm. In Advances in Intelligent and Soft Computing (Vol. 131 AISC, pp. 691–700). https://doi.org/10.1007/978-81-322-0491-6_63
