The data generated and processed by modern computing systems is growing rapidly. MapReduce is an important programming model for large-scale, data-intensive applications, and Hadoop is a popular open-source implementation of MapReduce and the Google File System (GFS). Hadoop's scalability and fault tolerance have made it a de facto standard for Big Data processing. Hadoop stores data in the Hadoop Distributed File System (HDFS), where data reliability and fault tolerance are achieved through replication. In this paper, a new technique called the Delay Scheduling Based Replication Algorithm (DSBRA) is proposed to identify and replicate (dereplicate) popular (unpopular) files/blocks in HDFS based on information collected from the scheduler. Experimental results show that the proposed method achieves 13% and 7% improvements in response time and data locality, respectively, over existing algorithms.
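The core idea, raising or lowering a file's HDFS replication factor according to the access popularity observed by the scheduler, can be illustrated with a minimal sketch. The code below is not the authors' DSBRA implementation: the access-count bookkeeping, the thresholds, and the example file path are hypothetical assumptions, and in DSBRA the popularity signal would come from delay-scheduling information rather than explicit `recordAccess` calls. Only the standard Hadoop `FileSystem` calls (`getFileStatus`, `setReplication`) are real APIs.

```java
// Minimal sketch (not the authors' DSBRA implementation) of popularity-driven
// replication in HDFS. Access counts, thresholds, and the example file path
// are hypothetical stand-ins for the scheduler-derived statistics DSBRA uses.
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PopularityReplicationSketch {

    // Hypothetical per-file access counts, e.g. accumulated from the
    // scheduler's record of which input files launched tasks have read.
    private final Map<Path, Long> accessCounts = new HashMap<>();

    private final FileSystem fs;
    private final long hotThreshold;    // accesses at or above this -> add a replica
    private final long coldThreshold;   // accesses at or below this -> remove a replica
    private final short maxReplication;
    private final short minReplication;

    public PopularityReplicationSketch(FileSystem fs,
                                       long hotThreshold, long coldThreshold,
                                       short maxReplication, short minReplication) {
        this.fs = fs;
        this.hotThreshold = hotThreshold;
        this.coldThreshold = coldThreshold;
        this.maxReplication = maxReplication;
        this.minReplication = minReplication;
    }

    // Called whenever the scheduler launches a task that reads this file.
    public void recordAccess(Path file) {
        accessCounts.merge(file, 1L, Long::sum);
    }

    // Periodically adjust replication factors based on observed popularity.
    public void adjustReplication() throws Exception {
        for (Map.Entry<Path, Long> e : accessCounts.entrySet()) {
            Path file = e.getKey();
            long count = e.getValue();
            short current = fs.getFileStatus(file).getReplication();

            if (count >= hotThreshold && current < maxReplication) {
                // Popular file: add one replica to improve data locality.
                fs.setReplication(file, (short) (current + 1));
            } else if (count <= coldThreshold && current > minReplication) {
                // Unpopular file: drop one replica to reclaim storage.
                fs.setReplication(file, (short) (current - 1));
            }
        }
        accessCounts.clear(); // start a fresh observation window
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        PopularityReplicationSketch sketch =
                new PopularityReplicationSketch(fs, 100, 5, (short) 6, (short) 2);
        sketch.recordAccess(new Path("/data/example-input.csv")); // hypothetical file
        sketch.adjustReplication();
    }
}
```

The sketch changes the replication factor by one per adjustment round; how aggressively replicas are added or removed, and how popularity is measured from the delay scheduler, are design choices of the paper that this simplified example does not capture.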
Citation: Suresh, S., & Gopalan, N. P. (2015). Delay Scheduling Based Replication Scheme for Hadoop Distributed File System. International Journal of Information Technology and Computer Science, 7(4), 73–78. https://doi.org/10.5815/ijitcs.2015.04.08