Locality Aware MapReduce

Abstract

The large amounts of data produced today must be processed efficiently. This can be done using Apache Hadoop, an open-source software library whose two core components are HDFS and MapReduce. Improving the performance of MapReduce improves the performance of the system overall. A Locality Aware MapReduce approach is introduced here. It comprises an input splitting strategy and a MapReduce scheduling algorithm, both based on data locality. The input splitting strategy, called Improved Input Splitting, clusters data blocks stored on the same node into a single split, so that the split is processed by one map task. The scheduling algorithm, when assigning tasks to a node, always prefers local map tasks over non-local map tasks, regardless of which job a task belongs to; that is, when a free slot becomes available, the algorithm first checks for a task with local data, and non-local data is given second preference. Since scheduling is done based on locality, it is called Locality Aware Scheduling. Each method, executed separately and in combination, showed better performance than the unmodified system.
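The two ideas in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation or a Hadoop API: `improved_input_splitting` groups blocks by the node holding them (one split per node, processed by one map task), and `schedule_next` implements the stated rule that a node with a free slot takes any local map task from any job before falling back to a non-local one. All names (`Task`, `schedule_next`, `input_nodes`) are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

def improved_input_splitting(block_locations):
    """Cluster data blocks stored on the same node into a single split.

    block_locations: dict mapping block_id -> node holding that block.
    Returns: dict mapping node -> list of block_ids forming one split.
    """
    splits = defaultdict(list)
    for block_id, node in block_locations.items():
        splits[node].append(block_id)
    return dict(splits)

@dataclass
class Task:
    task_id: str
    job_id: str
    input_nodes: frozenset  # nodes holding replicas of this task's input split

def schedule_next(free_node, pending_tasks):
    """Pick the next map task for a node with a free slot.

    Local tasks are preferred over non-local ones, no matter which job
    each task belongs to; non-local tasks are a second preference.
    """
    # First pass: any task, from any job, whose input data is on free_node.
    for task in pending_tasks:
        if free_node in task.input_nodes:
            pending_tasks.remove(task)
            return task
    # Fallback: a non-local task, only if no local task exists.
    if pending_tasks:
        return pending_tasks.pop(0)
    return None
```

For example, if `nodeA` frees a slot while the head of the queue is a task whose data lives on `nodeB`, the scheduler skips it and picks a task local to `nodeA`, even one from a different job.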


Citation (APA)

Rhine, R., & Bhuvan, N. T. (2016). Locality aware mapreduce. In Advances in Intelligent Systems and Computing (Vol. 424, pp. 221–228). Springer Verlag. https://doi.org/10.1007/978-3-319-28031-8_19
