A novel map-reduce based augmented clustering algorithm for big text datasets

Abstract

Text clustering is a well-known technique for improving quality in information retrieval. Real-world data is rarely organized in the way that precise mining requires, so a large collection of unstructured text documents must first be organized into clusters of related documents. Extracting compact and meaningful insights from such collections remains a contemporary challenge. Although many frequent-item mining algorithms have been proposed, most do not scale to "Big Data" and take considerable processing time. This paper presents a highly scalable, fast, and efficient map-reduce based augmented clustering algorithm that uses bivariate n-gram frequent items to reduce high dimensionality and derive high-quality clusters for big text documents. A comparative analysis on sample text datasets shows that the proposed algorithm performs better with stop word removal than without it.
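
The abstract does not give the algorithm's details, so the following is only a minimal single-process sketch of the general idea, not the authors' method. It assumes that "bivariate n-gram" means adjacent word pairs (bigrams), that a bigram is "frequent" when it appears in at least `min_support` documents, and that documents sharing a frequent bigram are candidates for the same cluster. The names `STOP_WORDS`, `map_bigrams`, `reduce_bigrams`, `frequent_bigrams`, and `min_support` are hypothetical and chosen for illustration.

```python
from collections import defaultdict

# Hypothetical, deliberately tiny stop word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in", "for"}


def map_bigrams(doc_id, text, remove_stop_words=True):
    """Map step: emit ((word1, word2), doc_id) for each adjacent word pair."""
    tokens = [t for t in text.lower().split() if t.isalpha()]
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return [((a, b), doc_id) for a, b in zip(tokens, tokens[1:])]


def reduce_bigrams(pairs):
    """Reduce step: group document ids by bigram key."""
    postings = defaultdict(set)
    for bigram, doc_id in pairs:
        postings[bigram].add(doc_id)
    return postings


def frequent_bigrams(docs, min_support=2, remove_stop_words=True):
    """Keep bigrams that occur in at least `min_support` documents.

    Each surviving bigram maps to the set of documents that share it,
    which this sketch treats as a candidate cluster.
    """
    pairs = [
        pair
        for doc_id, text in docs.items()
        for pair in map_bigrams(doc_id, text, remove_stop_words)
    ]
    postings = reduce_bigrams(pairs)
    return {bg: ids for bg, ids in postings.items() if len(ids) >= min_support}


if __name__ == "__main__":
    docs = {
        1: "text clustering of big data",
        2: "the text clustering algorithm",
        3: "map reduce for big data",
    }
    for bigram, doc_ids in sorted(frequent_bigrams(docs).items()):
        print(bigram, sorted(doc_ids))
```

In a real map-reduce framework the map and reduce functions would run in parallel across document partitions. In this toy input, removing stop words shrinks the set of distinct candidate bigrams, which illustrates the dimensionality reduction the abstract refers to.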

Cite

Kanimozhi, K. V., & Venkatesan, M. (2018). A novel map-reduce based augmented clustering algorithm for big text datasets. In Advances in Intelligent Systems and Computing (Vol. 542, pp. 427–436). Springer Verlag. https://doi.org/10.1007/978-981-10-3223-3_41
