HADOOP: A Comparative Study between Single-Node and Multi-Node Cluster

Abstract

Data analysis has become a challenge in recent years because the volume of data generated has become difficult to manage; as a result, more hardware and software resources are needed to store and process it. Apache Hadoop is a free framework, widely used thanks to the Hadoop Distributed File System (HDFS) and its ability to integrate with other data processing and analysis components, such as MapReduce for data processing, Spark for in-memory data processing, Apache Drill for SQL on Hadoop, and many others. In this paper, we analyze the implementation of the Hadoop framework through a comparative study of single-node and multi-node Hadoop clusters. We explain in detail the two layers at the base of the Hadoop architecture: the HDFS layer, with its NameNode, Secondary NameNode, and DataNode daemons, and the MapReduce layer, with its JobTracker and TaskTracker daemons. This work is part of a larger project that aims to perform data processing in Data Lake structures.

Cite

Zagan, E., & Danubianu, M. (2021). HADOOP: A Comparative Study between Single-Node and Multi-Node Cluster. International Journal of Advanced Computer Science and Applications, 12(2), 53–58. https://doi.org/10.14569/IJACSA.2021.0120207
