Handling small size files in Hadoop: Challenges, opportunities, and review

Abstract

Recent technological advancements in the field of computing have led to the generation of voluminous data that cannot be handled effectively by traditional tools, processes, and systems. To handle this big data effectively, new techniques and frameworks have emerged in recent times. Hadoop is a prominent framework for managing huge amounts of data; it provides efficient means for the storage, retrieval, processing, and analytics of big data. Although Hadoop works very well with large files, its performance tends to degrade when it is required to process hundreds or thousands of small files. This paper puts forward the challenges and opportunities that arise while handling a large number of small files. It also presents a comprehensive review of the techniques available for efficiently handling small files in Hadoop on the basis of performance parameters such as access time, read/write complexity, scalability, and processing speed.
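The degradation the abstract describes stems largely from the fact that every HDFS file, regardless of its size, occupies a metadata entry in the NameNode's memory and is typically processed by its own map task, so overhead grows with the file count rather than the data volume. As a rough illustration of the kind of mitigation such reviews cover, the sketch below packs many small local files into a single Hadoop SequenceFile keyed by filename; this is a generic technique, not a method from the paper, and the input directory and output path are hypothetical.

    // Minimal sketch: consolidate small files into one SequenceFile so the
    // NameNode tracks a few large blocks instead of thousands of tiny files.
    import java.io.File;
    import java.nio.file.Files;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class SmallFilePacker {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Path output = new Path("packed.seq"); // hypothetical output path

            try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                    SequenceFile.Writer.file(output),
                    SequenceFile.Writer.keyClass(Text.class),
                    SequenceFile.Writer.valueClass(BytesWritable.class))) {

                // Each small file becomes one (filename, contents) record.
                File[] smallFiles = new File("/data/small-files").listFiles(); // hypothetical dir
                if (smallFiles == null) return;
                for (File f : smallFiles) {
                    byte[] bytes = Files.readAllBytes(f.toPath());
                    writer.append(new Text(f.getName()), new BytesWritable(bytes));
                }
            }
        }
    }

Once consolidated this way, the data can be read by downstream MapReduce jobs with SequenceFileInputFormat, and the per-file metadata and task-scheduling costs that plague small-file workloads are largely avoided.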

Citation (APA)

Ahad, M. A., & Biswas, R. (2018). Handling small size files in Hadoop: Challenges, opportunities, and review. In Advances in Intelligent Systems and Computing (Vol. 758, pp. 653–663). Springer Verlag. https://doi.org/10.1007/978-981-13-0514-6_62
