Text mining with Apache Hadoop over different Hadoop clusters architectures



Big data is highly practical for real-time application systems. One of the most widely used classes of real-time applications worldwide operates on unstructured documents. Large numbers of documents are managed and maintained through Hadoop, a leading Big Data platform, which stores all information in blocks in the Hadoop Distributed File System (HDFS). Regardless of data size, Big Data has opened a path to storing and analyzing data, but doing so is time-consuming. To overcome this, Hadoop provides cluster-based processing for computations over large volumes of unstructured data. Three cluster architectures are considered: standalone, single-node cluster, and multi-node cluster. This paper studies and analyzes how these cluster architectures let the Hadoop platform boost processing speed over large datasets, using text documents from the 20 Newsgroups dataset. It also identifies the challenges of text mining and its applications using Apache Hadoop.




Lydia, E. L., Sekhar, G. C., Chevuru, M. B., Ramya, D., & Vijaya Kumar, K. (2019). Text mining with apache hadoop over different hadoop clusters architectures. International Journal of Recent Technology and Engineering, 8(2), 1252–1256. https://doi.org/10.35940/ijrte.B1866.078219
