Data clustering: Integrating different distance measures with modified k-means algorithm

Vaishali R. Patel; Rupa G. Mehta

Conference Proceedings

Advances in Intelligent and Soft Computing (2012) 131 AISC(VOL. 2) 691-700

DOI: 10.1007/978-81-322-0491-6_63

Abstract

Clustering is an unsupervised learning process that partitions a given data set into a number of clusters, such that similar data objects belong to the same cluster and dissimilar data objects belong to different clusters. k-Means is a partition-based unsupervised learning algorithm that is popular for its simplicity and ease of use. However, k-Means suffers from a major shortcoming: the number of clusters and the initial centroids must be supplied in advance. Decimal scaling is a normalization approach that standardizes the features of a dataset and improves the effectiveness of the algorithm. Integrating different distance measures with the modified k-Means algorithm (Mk-Means) helps to select the proper distance measure for a specific data mining application. This paper compares the results of modified k-Means under different distance measures, such as Euclidean distance, Manhattan distance, Minkowski distance, and cosine measure distance, combined with the decimal scaling normalization approach. The analysis is carried out on various datasets from the UCI Machine Learning Repository and shows that Mk-Means is advantageous and improves effectiveness when used with the normalized approach and the Minkowski distance measure. © 2012 Springer India Pvt. Ltd.
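As a rough illustration of the building blocks named in the abstract, the sketch below shows decimal scaling and the four distance measures in plain Python. It is not the authors' Mk-Means implementation, which the abstract does not describe. The function names are hypothetical, decimal scaling follows the common textbook definition (divide by the smallest power of ten that brings every absolute value below 1), and the cosine measure is assumed to be 1 minus cosine similarity, which may differ from the paper's definition.

```python
import math


def decimal_scale(column):
    """Decimal scaling (textbook definition, assumed here): divide every value
    by 10**j, where j is the smallest non-negative integer such that the
    largest absolute scaled value is below 1."""
    max_abs = max(abs(v) for v in column)
    j = 0
    while max_abs / (10 ** j) >= 1:
        j += 1
    return [v / (10 ** j) for v in column]


def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def manhattan(a, b):
    """Manhattan distance: Minkowski distance with p = 1."""
    return minkowski(a, b, 1)


def euclidean(a, b):
    """Euclidean distance: Minkowski distance with p = 2."""
    return minkowski(a, b, 2)


def cosine_distance(a, b):
    """Assumed form of the cosine measure: 1 - cosine similarity.
    Both vectors must be non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Example: scale one feature column, then compare two points.
scaled = decimal_scale([120, 45, 986])   # every value divided by 10**3
d_euclid = euclidean([0, 0], [3, 4])
d_manhattan = manhattan([0, 0], [3, 4])
```

In a k-Means style loop, any of these functions could serve as the point-to-centroid distance, which is one way to read the comparison the abstract reports.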

APA

Patel, V. R., & Mehta, R. G. (2012). Data clustering: Integrating different distance measures with modified k-means algorithm. In Advances in Intelligent and Soft Computing (Vol. 131 AISC, pp. 691–700). https://doi.org/10.1007/978-81-322-0491-6_63
