MapReduce Parallel Programming Model: A State-of-the-Art Survey

Abstract

With the development of information technologies, we have entered the era of Big Data. Google's MapReduce programming model and its open-source implementation, Apache Hadoop, have become the dominant paradigm for data-intensive processing because of their simplicity, scalability, and fault tolerance. However, several inherent limitations, such as the lack of efficient scheduling and of mechanisms for iterative computation, seriously affect the efficiency and flexibility of MapReduce. To date, various approaches have been proposed to extend the MapReduce model and improve its runtime efficiency in different scenarios. In this review, we assess MapReduce to help researchers better understand the optimizations that have been proposed to address its limitations. We first present the basic idea underlying the MapReduce paradigm and describe several widely used open-source runtime systems. We then discuss the main shortcomings of the original MapReduce model, review the optimization approaches that have recently been put forward, and categorize them according to their characteristics and capabilities. Finally, we conclude the paper and suggest several directions for future research.
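
To make the paradigm concrete, the sketch below shows the canonical word-count job written against the Hadoop MapReduce Java API: the map function emits a (word, 1) pair for every token, the framework shuffles the pairs by key, and the reduce function sums the counts for each word. This is an illustrative example, not code from the survey itself; the class names and the input/output paths passed on the command line are placeholders.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: emit (word, 1) for every token in the input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reduce phase: after the shuffle, sum the counts received for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    // Reusing the reducer as a combiner performs local aggregation
    // on each map node before the shuffle, reducing network traffic.
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // placeholder input path
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // placeholder output path
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}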

Citation (APA)

Li, R., Hu, H., Li, H., Wu, Y., & Yang, J. (2016, August 1). MapReduce parallel programming model: A state-of-the-art survey. International Journal of Parallel Programming. Springer New York LLC. https://doi.org/10.1007/s10766-015-0395-0
