Sources such as Yahoo, Facebook, mobile devices, sensors, and scientific instruments now generate huge amounts of data, and storing, managing, processing, and analyzing this data is a challenge. Apache Hadoop Yarn is a framework that provides a solution for big data. In this paper, we evaluate the performance of the Apache Hadoop Yarn MapReduce jobs Pi, TeraGen, TeraSort, and Wordcount on a single-node cluster. After this evaluation, the jobs are classified by CPU utilization (%) into classes such as low CPU intensive and high CPU intensive. Based on this classification, the Apache Hadoop Yarn MapReduce jobs are then executed on a multi-cluster environment and their performance is evaluated again. We find that, as the number of nodes increases, execution time increases for the low CPU intensive job and decreases for the high CPU intensive job. Total CPU time decreases for both the low and the high CPU intensive jobs. In addition, CPU utilization decreases for the low CPU intensive job and increases for the high CPU intensive job.
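The classification step described above can be illustrated with a minimal sketch. The abstract does not give the utilization formula or the class boundary, so both are assumptions here: utilization is taken as total CPU time divided by wall-clock time times the number of cores, and the 50% threshold is a hypothetical placeholder. The names `JobRun`, `cpu_utilization_pct`, and `classify`, and the sample numbers, are illustrative only and are not measurements from the paper.

```python
from dataclasses import dataclass


@dataclass
class JobRun:
    """One measured run of a MapReduce job (hypothetical record layout)."""
    name: str
    cpu_time_s: float   # total CPU time summed over all tasks, in seconds
    elapsed_s: float    # wall-clock execution time, in seconds
    cores: int          # CPU cores available to the job


def cpu_utilization_pct(run: JobRun) -> float:
    """CPU utilization in percent (assumed definition, not from the paper)."""
    return 100.0 * run.cpu_time_s / (run.elapsed_s * run.cores)


def classify(run: JobRun, threshold_pct: float = 50.0) -> str:
    """Label a run by comparing its utilization with an assumed threshold."""
    if cpu_utilization_pct(run) >= threshold_pct:
        return "high CPU intensive"
    return "low CPU intensive"


# Made-up placeholder runs, only to exercise the functions.
job_a = JobRun(name="job_a", cpu_time_s=80.0, elapsed_s=100.0, cores=1)
job_b = JobRun(name="job_b", cpu_time_s=20.0, elapsed_s=100.0, cores=2)

for run in (job_a, job_b):
    print(run.name, round(cpu_utilization_pct(run), 1), classify(run))
```

With a different utilization definition or threshold the same jobs could fall into different classes, so the boundary should be taken from the measured data rather than from this sketch.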
CITATION STYLE
Mathiya, B. J., & Desai, V. L. (2016). Apache Hadoop Yarn MapReduce job classification based on CPU utilization and performance evaluation on multi-cluster heterogeneous environment. In Advances in Intelligent Systems and Computing (Vol. 408, pp. 35–44). Springer Verlag. https://doi.org/10.1007/978-981-10-0129-1_4