Over the last few years, data volumes have grown tremendously, and data analytics is increasingly geared towards low-latency processing. Processing Big Data with traditional methodologies is neither cost-effective nor fast enough to meet these requirements. The existing socket-based communication (TCP/IP) used in Hadoop becomes a performance bottleneck when significant amounts of data are transferred over a multi-gigabit network fabric. To meet the emerging demands, the underlying design should be modified to make use of the data centre's powerful hardware. The proposed project integrates Hadoop with remote direct memory access (RDMA). For data-intensive applications, network performance becomes a key factor as the amount of data being stored and replicated to HDFS increases. RDMA is implemented on commodity hardware in software, namely through Soft-iWARP (Software Internet Wide Area RDMA Protocol). Hadoop employs a Java-based network transport stack on top of the JVM. The JVM introduces significant overhead on top of the native interfaces' data processing capability, which constrains the use of RDMA. A plug-in library for the data shuffling and merging part of Hadoop can take advantage of RDMA, so an optimization of Hadoop's data shuffling phase can be implemented.
Vejesh, V., Reshma Nayar, G., & Sathyadevan, S. (2015). Optimization of Hadoop using software-internet wide area remote direct memory access protocol and unstructured data accelerator. In Advances in Intelligent Systems and Computing (Vol. 349, pp. 261–270). Springer Verlag. https://doi.org/10.1007/978-3-319-18473-9_26