Average-link (AL) is a distance-based hierarchical clustering method that is not sensitive to noisy patterns. However, like all hierarchical clustering methods, AL needs to scan the dataset many times, and it has time and space complexity of O(n²), where n is the size of the dataset. These costs prohibit the use of AL for large datasets. In this paper, we propose a distance-based hierarchical clustering method, termed l-AL, which speeds up the classical AL method in any metric (vector or non-vector) space. In this scheme, the leaders clustering method is first applied to the dataset to derive a set of leaders, and AL clustering is subsequently applied to those leaders. To speed up the leaders clustering method, a reduction in the number of distance computations is also proposed. Experimental results confirm that the l-AL method is considerably faster than the classical AL method while keeping clustering results on par with it. © 2010 Springer-Verlag Berlin Heidelberg.
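The two-stage scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`leaders`, `average_link`, `l_al`), the threshold `tau`, and the target cluster count `k` are hypothetical, the average-link step is a naive unweighted version that ignores how many followers each leader has, and the proposed reduction in distance computations is not shown.

```python
import math
from itertools import combinations


def euclidean(a, b):
    # Any metric can be substituted here; Euclidean is only an assumption.
    return math.dist(a, b)


def leaders(data, tau, dist=euclidean):
    """Single scan: a pattern joins the first leader within tau, else becomes a leader."""
    leader_idx = []   # indices into data of the chosen leaders
    followers = []    # followers[p] lists the data indices grouped under leader p
    for i, x in enumerate(data):
        for pos, l in enumerate(leader_idx):
            if dist(x, data[l]) <= tau:
                followers[pos].append(i)
                break
        else:
            leader_idx.append(i)
            followers.append([i])
    return leader_idx, followers


def average_link(points, k, dist=euclidean):
    """Naive average-link: repeatedly merge the closest pair until k clusters remain."""
    clusters = [[i] for i in range(len(points))]

    def avg(c1, c2):
        total = sum(dist(points[a], points[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > max(k, 1):
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: avg(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


def l_al(data, tau, k, dist=euclidean):
    """Leaders first, then average-link on the leaders only (hypothetical helper)."""
    leader_idx, followers = leaders(data, tau, dist)
    leader_points = [data[i] for i in leader_idx]
    merged = average_link(leader_points, k, dist)
    labels = [None] * len(data)
    for label, cluster in enumerate(merged):
        for pos in cluster:
            for i in followers[pos]:
                labels[i] = label  # each follower inherits its leader's cluster
    return labels
```

Because the quadratic average-link step runs only on the leaders, its cost depends on the number of leaders rather than on n; a larger `tau` yields fewer leaders and a coarser but cheaper result.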
Patra, B. K., Hubballi, N., Biswas, S., & Nandi, S. (2010). Distance based fast hierarchical clustering method for large datasets. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6086 LNAI, pp. 50–59). https://doi.org/10.1007/978-3-642-13529-3_7