The optimization and improvement of mapreduce in web data mining

Changqing Yin; Shichao Zhang; Shukun Liu; Shangwei Song; Guangyu Gao; Xiyuan Zhou

Conference Proceedings


Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 9528 755-762

DOI: 10.1007/978-3-319-27119-4_53


Abstract

Extracting and mining social network information from massive Web data is of both theoretical and practical significance. However, a defining feature of this task is large-scale data processing, which remains a great challenge. MapReduce is a distributed programming model: by implementing just two functions, map and reduce, a programmer can get distributed tasks to work well. Nevertheless, the model does not directly support processing heterogeneous datasets, which are common on the Web. This article proposes a framework called Map-Reduce-Merge, which extends the original MapReduce framework with a merge phase that can efficiently handle heterogeneous data processing. In addition, several optimizations and improvements are made based on the features of Web data.
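The idea of adding a merge phase after map and reduce can be illustrated with a minimal single-process sketch. This is not the authors' framework: the helper names (`map_reduce`, `merge`), the two sample datasets, and the choice to merge on a shared user key are all assumptions made for illustration, and nothing here is distributed.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run one in-memory map pass and one reduce pass (hypothetical helper)."""
    groups = defaultdict(list)
    for record in records:
        # The mapper emits zero or more (key, value) pairs per record.
        for key, value in mapper(record):
            groups[key].append(value)
    # The reducer collapses all values that share a key into one result.
    return {key: reducer(key, values) for key, values in groups.items()}


def merge(left, right, merger):
    """Assumed merge phase: combine two reduced outputs on their shared keys."""
    shared_keys = left.keys() & right.keys()
    return {key: merger(key, left[key], right[key]) for key in shared_keys}


# Two heterogeneous, made-up datasets: user profiles and follower links.
profiles = [("alice", "Shanghai"), ("bob", "Beijing")]
links = [("alice", "bob"), ("alice", "carol"), ("bob", "alice")]

# Each dataset gets its own map and reduce functions.
cities = map_reduce(profiles, lambda r: [(r[0], r[1])], lambda k, vs: vs[0])
degrees = map_reduce(links, lambda r: [(r[0], 1)], lambda k, vs: sum(vs))

# The merge step joins the two reduced outputs by user name.
merged = merge(cities, degrees, lambda k, city, degree: (city, degree))
print(merged)
```

In this sketch each dataset keeps its own schema through map and reduce, and only the merge step needs to know how the two outputs relate, which is the kind of separation a merge phase is meant to provide.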


Citation (APA)

Yin, C., Zhang, S., Liu, S., Song, S., Gao, G., & Zhou, X. (2015). The optimization and improvement of mapreduce in web data mining. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9528, pp. 755–762). Springer Verlag. https://doi.org/10.1007/978-3-319-27119-4_53
