A novel map-reduce based augmented clustering algorithm for big text datasets

K. V. Kanimozhi; M. Venkatesan

Conference Proceedings

Advances in Intelligent Systems and Computing (2018) 542 427-436

DOI: 10.1007/978-981-10-3223-3_41

8 Citations

6 Readers

Abstract

Text clustering is a well-known technique for improving quality in information retrieval. Real-world data is rarely organized in the manner needed for precise mining, so a large collection of unstructured text documents must first be organized into clusters of related documents. Extracting compact and meaningful insights from such collections remains a contemporary challenge. Although many frequent-item mining algorithms have been proposed, most do not scale to “Big Data” and take considerable processing time. This paper presents a highly scalable, fast, and efficient map-reduce based augmented clustering algorithm that uses bivariate n-gram frequent items to reduce high dimensionality and derive high-quality clusters for big text documents. A comparative analysis on sample text datasets shows that the proposed algorithm performs better with stop word removal than without it.
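As a rough illustration of the idea in the abstract, the sketch below counts frequent two-word n-grams across documents in a map step and a reduce step, after removing stop words. It is not the authors' algorithm: reading "bivariate n-gram" as a bigram is an assumption, and the stop word list, the `min_support` threshold, and all function names are hypothetical. It runs in a single process and only mimics the map-reduce pattern.

```python
from collections import Counter
from functools import reduce

# Hypothetical stop word list; the paper's actual list is not given here.
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in", "for"}


def map_bigrams(document):
    """Map step: count the bigrams of one document after stop word removal."""
    tokens = [t for t in document.lower().split() if t not in STOP_WORDS]
    # Pair each remaining token with its successor to form bigrams.
    return Counter(zip(tokens, tokens[1:]))


def reduce_counts(left, right):
    """Reduce step: merge two partial bigram counts."""
    return left + right


def frequent_bigrams(documents, min_support=2):
    """Keep bigrams whose total count reaches min_support (assumed threshold)."""
    total = reduce(reduce_counts, map(map_bigrams, documents), Counter())
    return {bigram: n for bigram, n in total.items() if n >= min_support}


docs = [
    "text clustering is a technique for information retrieval",
    "text clustering of big text documents",
    "information retrieval for big data",
]
frequent = frequent_bigrams(docs)
print(frequent)
```

Only the surviving frequent bigrams would then serve as features, which is one way the dimensionality of the document representation could shrink before clustering. Note that removing stop words first makes some words adjacent that were not adjacent in the original text.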

Cite (APA)

Kanimozhi, K. V., & Venkatesan, M. (2018). A novel map-reduce based augmented clustering algorithm for big text datasets. In Advances in Intelligent Systems and Computing (Vol. 542, pp. 427–436). Springer Verlag. https://doi.org/10.1007/978-981-10-3223-3_41
