Apache Hadoop Yarn MapReduce job classification based on CPU utilization and performance evaluation on multi-cluster heterogeneous environment


Abstract

Sources such as Yahoo, Facebook, mobile devices, sensors, and scientific instruments generate huge amounts of data, and storing, managing, processing, and analyzing this data is a challenge. Apache Hadoop Yarn is a framework that provides a solution for big data. In this paper, we evaluate the performance of Apache Hadoop Yarn MapReduce jobs such as Pi, TeraGen, TeraSort, and Wordcount on a single cluster node. After this evaluation, the jobs are classified by CPU utilization (%) into classes such as low CPU intensive and high CPU intensive. Based on this classification, the jobs are then executed on a multi-cluster environment and their performance is evaluated. In the multi-cluster environment, execution time increased for low CPU intensive jobs and decreased for high CPU intensive jobs, while total CPU time decreased for both classes. In addition, CPU utilization decreased for low CPU intensive jobs and increased for high CPU intensive jobs as the number of nodes increased.
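
The classification step described in the abstract can be illustrated with a minimal sketch. The abstract gives neither the formula used for CPU utilization nor the boundary between classes, so everything here is an assumption made for illustration: utilization is taken as total CPU time divided by wall-clock time times available cores, the 50% cutoff is hypothetical, and the names `JobRun`, `cpu_utilization_pct`, and `classify` are invented for this example rather than taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the abstract does not state where the
# boundary between "low" and "high" CPU intensive jobs lies.
HIGH_CPU_THRESHOLD_PCT = 50.0


@dataclass
class JobRun:
    """Measurements for one job run (all field names are illustrative)."""
    name: str
    cpu_time_s: float   # total CPU time consumed by the job, in seconds
    elapsed_s: float    # wall-clock execution time, in seconds
    cores: int          # CPU cores available to the job


def cpu_utilization_pct(run: JobRun) -> float:
    """Assumed definition: CPU time as a share of available core-seconds."""
    if run.elapsed_s <= 0 or run.cores <= 0:
        raise ValueError("elapsed_s and cores must be positive")
    return 100.0 * run.cpu_time_s / (run.elapsed_s * run.cores)


def classify(run: JobRun, threshold_pct: float = HIGH_CPU_THRESHOLD_PCT) -> str:
    """Label a run as low or high CPU intensive using the assumed cutoff."""
    if cpu_utilization_pct(run) >= threshold_pct:
        return "high CPU intensive"
    return "low CPU intensive"


if __name__ == "__main__":
    # Made-up measurements, not results from the paper.
    for run in (JobRun("job_a", 80.0, 100.0, 1), JobRun("job_b", 10.0, 100.0, 1)):
        print(run.name, f"{cpu_utilization_pct(run):.1f}%", classify(run))
```

With measurements from a real cluster, the cutoff would have to be chosen from the observed utilization values rather than fixed in advance.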

Cite

Mathiya, B. J., & Desai, V. L. (2016). Apache Hadoop Yarn MapReduce job classification based on CPU utilization and performance evaluation on multi-cluster heterogeneous environment. In Advances in Intelligent Systems and Computing (Vol. 408, pp. 35–44). Springer Verlag. https://doi.org/10.1007/978-981-10-0129-1_4
