Delay Scheduling Based Replication Scheme for Hadoop Distributed File System

  • Suresh S
  • Gopalan N

Abstract

The volume of data generated and processed by modern computing systems is growing rapidly. MapReduce is an important programming model for large-scale, data-intensive applications. Hadoop is a popular open-source implementation of MapReduce and the Google File System (GFS). Its scalability and fault tolerance have made it a standard for big data processing. Hadoop stores data in the Hadoop Distributed File System (HDFS), which achieves data reliability and fault tolerance through replication. This paper proposes a new technique, the Delay Scheduling Based Replication Algorithm (DSBRA), which uses information collected from the scheduler to identify popular files/blocks in HDFS and replicate them, and to identify unpopular ones and dereplicate them. Experimental results show that the proposed method improves response time by 13% and locality by 7% over existing algorithms.
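
The abstract does not spell out how DSBRA works, so the following is not the authors' algorithm. It is a minimal Python sketch of the general idea of scheduler-driven replication, under stated assumptions: the scheduler is assumed to log the block id of every task it could not launch data-local, and the function name `plan_replication`, the thresholds `hot` and `cold`, and the bounds `min_rep` and `max_rep` are all hypothetical.

```python
from collections import Counter


def plan_replication(locality_misses, replicas, hot=3, cold=0, min_rep=3, max_rep=6):
    """Return {block_id: new_replica_count} for blocks whose count should change.

    locality_misses: block_id -> number of tasks that could not run data-local
                     (assumed to come from the scheduler; hypothetical input).
    replicas:        block_id -> current replica count.
    hot, cold, min_rep, max_rep: illustrative values, not taken from the paper.
    """
    plan = {}
    for block, current in replicas.items():
        misses = locality_misses.get(block, 0)
        if misses >= hot and current < max_rep:
            # Popular block: tasks keep missing locality, so add one replica.
            plan[block] = current + 1
        elif misses <= cold and current > min_rep:
            # Unpopular block holding extra replicas: drop one.
            plan[block] = current - 1
    return plan


# Hypothetical scheduler log: one block id per non-local task launch.
scheduler_log = ["b1", "b1", "b1", "b2", "b1"]
misses = Counter(scheduler_log)

replicas = {"b1": 3, "b2": 3, "b3": 5}
plan = plan_replication(misses, replicas)
print(plan)  # {'b1': 4, 'b3': 4}
```

In this sketch, block `b1` gains a replica because four tasks missed locality on it, `b3` loses one because it holds extra replicas and saw no misses, and `b2` is left unchanged. In a real cluster the resulting counts would still have to be applied through HDFS itself.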

Cite

Suresh, S., & Gopalan, N. P. (2015). Delay Scheduling Based Replication Scheme for Hadoop Distributed File System. International Journal of Information Technology and Computer Science, 7(4), 73–78. https://doi.org/10.5815/ijitcs.2015.04.08
