A novel clustering approach using Hadoop distributed environment

Nagesh Vadaparthi; P. Srinivas Rao; Y. Srinivas; M. Athmaja

Conference Proceedings

SpringerBriefs in Applied Sciences and Technology (2015) 113-119

DOI: 10.1007/978-981-287-338-5_9

Abstract

Information retrieval plays a vital role in letting users retrieve documents of interest based on a relevance score. Such systems can be implemented on either distributed or parallel systems to achieve high throughput. When such a framework is deployed in a cloud, grouping relevant documents is essential for retrieving the documents of interest. Hence, efficient and scalable clustering is required to process a huge volume of documents. Apache Hadoop, with its powerful MapReduce feature, handles large document collections efficiently and provides scalability during processing. This paper therefore proposes a novel approach capable of clustering bulk data with high throughput. It also demonstrates the need for a parallel caching approach to obtain effective results.

Citation (APA)

Vadaparthi, N., Srinivas Rao, P., Srinivas, Y., & Athmaja, M. (2015). A novel clustering approach using Hadoop distributed environment. In SpringerBriefs in Applied Sciences and Technology (pp. 113–119). Springer Verlag. https://doi.org/10.1007/978-981-287-338-5_9
