Join optimization for large-scale data analysis in MapReduce

Li Zhang; Shicheng Xu; Chengbao Peng

Conference Proceedings

Lecture Notes in Electrical Engineering (2013) 236 LNEE 651-657

DOI: 10.1007/978-1-4614-7010-6_73

Abstract

With the arrival of the big data era, handling and processing huge amounts of data has become a focus of attention. The MapReduce parallel computing framework is increasingly used for large-scale data analysis. Although the join operation has been studied extensively in traditional relational databases, join algorithms in MapReduce remain inefficient. In this paper, we describe a number of well-known join algorithms in MapReduce and present an experimental comparison of these algorithms on a Hadoop cluster. An optimization algorithm for the map-side chain is also proposed. © 2013 Springer Science+Business Media New York.

Cite

Zhang, L., Xu, S., & Peng, C. (2013). Join optimization for large-scale data analysis in MapReduce. In Lecture Notes in Electrical Engineering (Vol. 236 LNEE, pp. 651–657). Springer Verlag. https://doi.org/10.1007/978-1-4614-7010-6_73
