MapCombine: A lightweight solution to improve the efficiency of iterative MapReduce

Wei Xu; Xiujun Gong; Xiaoyu Li

Journal Article

Communications in Computer and Information Science (2013) 332 444-456

DOI: 10.1007/978-3-642-34447-3_40

Abstract

MapReduce is an effective distributed computing strategy for processing massive-scale data. However, for iterative applications, general MapReduce must re-initialize the runtime environment and reload static data in every iteration, which wastes a great deal of CPU time and I/O bandwidth. This paper presents MapCombine, a lightweight solution for improving the efficiency of iterative MapReduce. The main contributions of MapCombine are as follows: (1) to avoid re-initializing the runtime environment, a controller component is plugged into the general MapReduce model to schedule the iterations; (2) to process data without reloading the static subset, we modify the general MapReduce model around the combine phase to cache the fixed data and 4e the workload before processing; (3) to make communication between the controller and the combiners flexible while accounting for fault tolerance and downtime recovery, we append an interaction layer to the MapReduce implementation architecture. We also compare the performance of MapCombine and Mahout on four clustering algorithms and find that MapCombine provides an average speedup ratio of 1.14. © Springer-Verlag Berlin Heidelberg 2012.
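
The idea of a controller that schedules iterations over workers holding cached static data can be illustrated with a small single-process sketch. This is not the paper's implementation: the names `CachedWorker`, `map_combine`, and `controller` are hypothetical, k-means is chosen only because the abstract mentions clustering, and no real MapReduce or Mahout API is used.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]
Partial = Dict[int, Tuple[float, float, int]]  # centroid index -> (sum_x, sum_y, count)


class CachedWorker:
    """Hypothetical worker that loads its static partition once and reuses it."""

    def __init__(self, loader: Callable[[], List[Point]]) -> None:
        self._loader = loader
        self._cache = None
        self.loads = 0  # counts how often the static data was actually loaded

    def data(self) -> List[Point]:
        if self._cache is None:  # load only on first use, then serve from cache
            self._cache = self._loader()
            self.loads += 1
        return self._cache

    def map_combine(self, centroids: List[Point]) -> Partial:
        """Assign each cached point to its nearest centroid (map) and
        pre-aggregate per-centroid sums locally (combine)."""
        partial: Partial = {}
        for x, y in self.data():
            idx = min(
                range(len(centroids)),
                key=lambda i: (x - centroids[i][0]) ** 2 + (y - centroids[i][1]) ** 2,
            )
            sx, sy, n = partial.get(idx, (0.0, 0.0, 0))
            partial[idx] = (sx + x, sy + y, n + 1)
        return partial


def controller(workers: List[CachedWorker], centroids: List[Point],
               max_iters: int = 20, tol: float = 1e-9) -> List[Point]:
    """Drive the iterations in one long-lived loop instead of starting a new job each time."""
    for _ in range(max_iters):
        totals: Partial = {}
        for w in workers:
            for idx, (sx, sy, n) in w.map_combine(centroids).items():
                tx, ty, tn = totals.get(idx, (0.0, 0.0, 0))
                totals[idx] = (tx + sx, ty + sy, tn + n)
        # Recompute centroids; keep the old one if no point was assigned to it.
        new = [
            (totals[i][0] / totals[i][2], totals[i][1] / totals[i][2]) if i in totals else c
            for i, c in enumerate(centroids)
        ]
        shift = max(abs(a - c) + abs(b - d) for (a, b), (c, d) in zip(new, centroids))
        centroids = new
        if shift < tol:  # stop once the centroids no longer move
            break
    return centroids
```

In this sketch only the small, changing state (the centroids) travels between the controller and the workers on each iteration, while each worker's static partition is loaded exactly once regardless of how many iterations run.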

Cite

Xu, W., Gong, X., & Li, X. (2013). MapCombine: A lightweight solution to improve the efficiency of iterative MapReduce. Communications in Computer and Information Science, 332, 444–456. https://doi.org/10.1007/978-3-642-34447-3_40