Performing Hierarchical Clustering on Huge Volumes of Data Using Enhanced Mapreduce Technique

Abstract

Among the various clustering methods, hierarchical clustering is advantageous in many respects. Applying hierarchical clustering to large volumes of data is difficult because such data are typically unstructured, heterogeneous, noisy, and volatile. The MapReduce framework is used to analyze huge volumes of data in a parallel and distributed fashion. The efficiency of the algorithm is improved by two optimization techniques: co-occurrence based feature selection and batch updating. This paper therefore presents a hierarchical clustering method that uses an enhanced version of the MapReduce framework for huge volumes of data. The experiments are conducted on a web access log file containing 512 GB of data. The results show that the proposed method outperforms traditional clustering methods in terms of execution time and number of clusters formed.
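
The abstract does not spell out how co-occurrence based feature selection is computed, so the following is only an illustrative sketch of one plausible reading, not the authors' algorithm. It simulates a map step and a reduce step in plain Python: the map step emits a count for every pair of features seen together in one record, the reduce step sums the counts, and features that take part in a sufficiently frequent pair are kept. The function names, the `min_cooccurrence` threshold, and the sample sessions are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def map_cooccurrence(record):
    """Map step: emit ((feature_a, feature_b), 1) for each unordered pair in one record."""
    features = sorted(set(record))
    for pair in combinations(features, 2):
        yield pair, 1


def reduce_counts(mapped):
    """Shuffle and reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in mapped:
        totals[key] += value
    return dict(totals)


def select_features(records, min_cooccurrence):
    """Keep features that appear in at least one pair seen min_cooccurrence times or more.

    The threshold rule is an assumption made for illustration only.
    """
    mapped = (kv for record in records for kv in map_cooccurrence(record))
    pair_counts = reduce_counts(mapped)
    kept = set()
    for (a, b), count in pair_counts.items():
        if count >= min_cooccurrence:
            kept.update((a, b))
    return kept


# Hypothetical sessions, each a list of URL paths requested in one visit.
sessions = [
    ["/home", "/cart", "/checkout"],
    ["/home", "/cart"],
    ["/home", "/about"],
    ["/cart", "/checkout"],
]
selected = select_features(sessions, min_cooccurrence=2)
```

In a real MapReduce deployment the map and reduce functions would run on separate workers over partitions of the log; here they run in one process only to show the data flow.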

Cite

Maheswari, K., & Ramakrishnan, M. (2020). Performing Hierarchical Clustering on Huge Volumes of Data Using Enhanced Mapreduce Technique. In Lecture Notes in Networks and Systems (Vol. 118, pp. 315–324). Springer. https://doi.org/10.1007/978-981-15-3284-9_36
