Handling small size files in Hadoop: Challenges, opportunities, and review

Abstract

Recent technological advancements in the field of computing have led to the generation of voluminous data that cannot be handled effectively by traditional tools, processes, and systems. To handle this big data effectively, new techniques and frameworks have emerged in recent times. Hadoop is a prominent framework for managing huge amounts of data; it provides efficient means for the storage, retrieval, processing, and analytics of big data. Although Hadoop works very well with large files, its performance tends to degrade when it is required to process hundreds or thousands of small files. This paper puts forward the challenges and opportunities that arise while handling a large number of small files. It also presents a comprehensive review of the techniques available for efficiently handling small files in Hadoop on the basis of performance parameters such as access time, read/write complexity, scalability, and processing speed.
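The degradation the abstract describes stems largely from the fact that every HDFS file, regardless of its size, occupies a metadata entry in the NameNode's memory and is typically processed by its own map task, so overhead grows with the file count rather than the data volume. As a rough illustration of the kind of mitigation such reviews cover, the sketch below packs many small local files into a single Hadoop SequenceFile keyed by filename; this is a generic technique, not a method from the paper, and the input directory and output path are hypothetical.

    // Minimal sketch: consolidate small files into one SequenceFile so the
    // NameNode tracks a few large blocks instead of thousands of tiny files.
    import java.io.File;
    import java.nio.file.Files;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class SmallFilePacker {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Path output = new Path("packed.seq"); // hypothetical output path

            try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                    SequenceFile.Writer.file(output),
                    SequenceFile.Writer.keyClass(Text.class),
                    SequenceFile.Writer.valueClass(BytesWritable.class))) {

                // Each small file becomes one (filename, contents) record.
                File[] smallFiles = new File("/data/small-files").listFiles(); // hypothetical dir
                if (smallFiles == null) return;
                for (File f : smallFiles) {
                    byte[] bytes = Files.readAllBytes(f.toPath());
                    writer.append(new Text(f.getName()), new BytesWritable(bytes));
                }
            }
        }
    }

Once consolidated this way, the data can be read by downstream MapReduce jobs with SequenceFileInputFormat, and the per-file metadata and task-scheduling costs that plague small-file workloads are largely avoided.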

Citation (APA)

Ahad, M. A., & Biswas, R. (2018). Handling small size files in Hadoop: Challenges, opportunities, and review. In Advances in Intelligent Systems and Computing (Vol. 758, pp. 653–663). Springer Verlag. https://doi.org/10.1007/978-981-13-0514-6_62
