Resource Efficiency Optimization for Big Data Mining Algorithm with Multi-MapReduce Collaboration Scenario

Abstract

Every MapReduce job independently performs a series of complex operations such as task scheduling and resource allocation. As a result, when several MapReduce jobs are coordinated by the same algorithm, they incur a great deal of redundant disk I/O and repeated resource requests, which leads to inefficient resource utilization during job execution. Big data mining algorithms are usually divided into several MapReduce jobs. Taking the ItemBased algorithm as an example, this paper analyzes the resource efficiency of mining algorithms in a multi-MapReduce job collaboration scenario. It proposes an ItemBased algorithm based on DistributedCache, which uses DistributedCache to cache the I/O data passed between multiple MapReduce jobs, overcomes the isolation between jobs, and reduces the waiting delay between Map and Reduce tasks. The experimental results show that DistributedCache improves the data reading speed of MapReduce jobs. The algorithm restructured with DistributedCache greatly reduces the waiting delay between Map and Reduce tasks and improves resource efficiency by more than three times.
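
The idea of handing one job's output to the next job as cached side data can be illustrated with a minimal, single-process Python sketch. This is not the paper's implementation and does not use Hadoop: `run_job`, the three job functions, and the `RATINGS` data are hypothetical names and values chosen for illustration, and an in-memory dictionary stands in for a file that a real Hadoop job would register with DistributedCache (for example through `Job.addCacheFile`) and load in each mapper's setup step.

```python
from collections import defaultdict
from itertools import combinations


def run_job(records, mapper, reducer):
    """Toy MapReduce runner: map every record, group by key, then reduce."""
    groups = defaultdict(list)
    for rec in records:
        for key, value in mapper(rec):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical input: (user, item, preference) triples.
RATINGS = [
    ("u1", "A", 5.0), ("u1", "B", 3.0),
    ("u2", "A", 4.0), ("u2", "B", 2.0), ("u2", "C", 1.0),
    ("u3", "B", 4.0), ("u3", "C", 5.0),
]


def job1_user_vectors(ratings):
    """Job 1: collect each user's item preferences into one vector."""
    return run_job(
        ratings,
        lambda r: [(r[0], (r[1], r[2]))],
        lambda user, pairs: dict(pairs),
    )


def job2_cooccurrence(user_vectors):
    """Job 2: count how often two items appear in the same user's vector."""
    def mapper(rec):
        _, prefs = rec
        for a, b in combinations(sorted(prefs), 2):
            yield (a, b), 1
            yield (b, a), 1
    return run_job(user_vectors.items(), mapper, lambda key, ones: sum(ones))


def job3_recommend(user_vectors, cooccurrence_cache):
    """Job 3: score unseen items, reading job 2's output as cached side data.

    `cooccurrence_cache` is an assumption standing in for a DistributedCache
    file; the mapper looks it up directly instead of joining it via a shuffle.
    """
    def mapper(rec):
        user, prefs = rec
        for (a, b), count in cooccurrence_cache.items():
            if a in prefs and b not in prefs:
                yield (user, b), count * prefs[a]
    return run_job(user_vectors.items(), mapper, lambda key, parts: sum(parts))


if __name__ == "__main__":
    vectors = job1_user_vectors(RATINGS)
    cooccurrence = job2_cooccurrence(vectors)
    print(job3_recommend(vectors, cooccurrence))
```

In this sketch the third job never re-reads the co-occurrence counts as a second shuffled input; each mapper call consults the cached table directly, which is the kind of inter-job I/O the abstract describes avoiding. Any timing or resource behavior on a real cluster is outside what this toy example can show.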

Fengli, Z., & Xiaoli, L. (2019). Resource Efficiency Optimization for Big Data Mining Algorithm with Multi-MapReduce Collaboration Scenario. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11645 LNAI, pp. 503–514). Springer Verlag. https://doi.org/10.1007/978-3-030-26766-7_46
