Text mining with apache hadoop over different hadoop clusters architectures

E. Laxmi Lydia; Gorapalli Chandra Sekhar; Madhu Babu Chevuru; Dasari Ramya; K. Vijaya Kumar

Journal Article (Open Access)

International Journal of Recent Technology and Engineering (2019) 8(2) 1252-1256

DOI: 10.35940/ijrte.B1866.078219

1 Citation

6 Readers

Abstract

Big Data techniques are highly practical for real-time application systems. One of the most widely used classes of real-time application worldwide operates on unstructured documents. Large numbers of documents are managed and maintained through Hadoop, a popular, leading Big Data platform, which stores all information in blocks in the Hadoop Distributed File System (HDFS). Irrespective of data size, Big Data has opened a path to storing and analyzing data, but doing so is time-consuming. To overcome this, Hadoop provides cluster-based processing for computations over large volumes of unstructured data. Three cluster architectures are considered: standalone, single-node cluster, and multi-node cluster. In this paper, the Hadoop platform is used to boost processing speed over large datasets through these cluster architectures, which are studied and analyzed using text documents from the 20 Newsgroups dataset. The paper also identifies the challenges of text mining and its applications using Apache Hadoop.
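The abstract does not describe the paper's actual text-mining jobs, so the following is only a minimal sketch of the kind of MapReduce-style computation Hadoop distributes across a cluster: a word count over text documents. It runs in a single Python process with no Hadoop dependency. The function names and the sample documents are hypothetical, chosen for illustration, and are not taken from the paper or from the 20 Newsgroups dataset.

```python
import re
from collections import defaultdict
from itertools import chain


def map_tokens(document):
    """Map phase: emit a (word, 1) pair for each lowercase alphabetic token."""
    return [(word, 1) for word in re.findall(r"[a-z]+", document.lower())]


def shuffle(pairs):
    """Shuffle phase: group emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce phase: sum the emitted counts for each word."""
    return {word: sum(values) for word, values in grouped.items()}


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = chain.from_iterable(map_tokens(doc) for doc in documents)
    return reduce_counts(shuffle(pairs))


# Hypothetical stand-in documents (assumption: not real dataset content).
SAMPLE_DOCS = [
    "Hadoop stores data in blocks",
    "Hadoop processes data in parallel",
]
COUNTS = word_count(SAMPLE_DOCS)
```

On a real cluster the map and reduce steps would run as separate tasks over HDFS blocks, and the framework would perform the shuffle. Here all three phases are ordinary function calls, so the sketch shows only the data flow, not the speedup the paper measures.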

Cite

Lydia, E. L., Sekhar, G. C., Chevuru, M. B., Ramya, D., & Vijaya Kumar, K. (2019). Text mining with apache hadoop over different hadoop clusters architectures. International Journal of Recent Technology and Engineering, 8(2), 1252–1256. https://doi.org/10.35940/ijrte.B1866.078219
