Improving MapReduce performance through complexity and performance based data placement in heterogeneous Hadoop clusters

Abstract

MapReduce has emerged as an important programming model for clusters with tens of thousands of nodes. Hadoop, an open-source implementation of MapReduce, may contain nodes that are heterogeneous in computing capacity for various reasons. It is therefore important for data placement algorithms to partition the input and intermediate data according to the computing capacities of the nodes in the cluster. We propose several enhancements to the data placement algorithms in Hadoop so that load is distributed evenly across the nodes. First, we propose two techniques for measuring the computing capacities of the nodes. Second, we propose improvements to the input data distribution algorithm based on the complexities of the map and reduce functions and the measured heterogeneity of the nodes. Finally, we evaluate the resulting improvement in MapReduce performance. © 2013 Springer-Verlag Berlin Heidelberg.
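The core idea of capacity-based placement can be illustrated with a minimal sketch. This is not the paper's algorithm: the `proportional_split` helper, the numeric capacity scores, and the largest-remainder tie-breaking are illustrative assumptions. It simply assigns input blocks to nodes in proportion to each node's measured capacity, so faster nodes receive more data:

```python
def proportional_split(total_blocks, capacities):
    """Assign total_blocks among nodes in proportion to their
    measured capacity scores (illustrative sketch only)."""
    total_capacity = sum(capacities)
    # ideal (fractional) share of blocks for each node
    shares = [c / total_capacity * total_blocks for c in capacities]
    alloc = [int(s) for s in shares]  # round down first
    # hand leftover blocks to the nodes with the largest
    # fractional parts (largest-remainder method)
    leftover = total_blocks - sum(alloc)
    order = sorted(range(len(capacities)),
                   key=lambda i: shares[i] - alloc[i],
                   reverse=True)
    for i in order[:leftover]:
        alloc[i] += 1
    return alloc

# A node twice as fast receives twice as many blocks:
print(proportional_split(100, [1, 1, 2]))  # [25, 25, 50]
```

In the same spirit, the complexity-aware variant described in the abstract could divide each node's capacity score by the estimated per-record cost of the map function before splitting, so that compute-heavy jobs shift even more data toward the faster nodes.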

CITATION STYLE

APA

Arasanal, R. M., & Rumani, D. U. (2013). Improving MapReduce performance through complexity and performance based data placement in heterogeneous Hadoop clusters. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7753 LNCS, pp. 115–125). Springer Verlag. https://doi.org/10.1007/978-3-642-36071-8_8
