Big data management processing with Hadoop MapReduce and spark technology: A comparison

Ankush Verma; Ashik Hussain Mansuri; Neelesh Jain

Conference Proceedings

2016 Symposium on Colossal Data Analysis and Networking, CDAN 2016 (2016)

DOI: 10.1109/CDAN.2016.7570891

57 Citations

83 Readers

Abstract

Hadoop MapReduce analyzes large volumes of data by processing them in parallel across multiple nodes. MapReduce consists of two functions, Map and Reduce, while the data itself is stored in HDFS. Because MapReduce lacks facilities for real-time stream processing and fast queries, Spark was designed to provide them. Spark jobs operate on Resilient Distributed Datasets (RDDs) and run on a directed acyclic graph (DAG) execution engine. In this paper, we elaborate on how Hadoop MapReduce works and on the Spark architecture, along with the kinds of operations it supports. We also show the differences between Hadoop MapReduce and Spark by examining the Map and Reduce phases individually.
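As an illustration of the Map and Reduce functions the abstract mentions, the following is a minimal word-count sketch in plain Python. It is not taken from the paper and does not use the Hadoop or Spark APIs; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and the whole pipeline runs in a single process rather than across nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group emitted values by key, as a framework would between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for lines of a file stored in HDFS.
    sample = ["spark and hadoop", "hadoop stores data in hdfs"]
    print(word_count(sample))
```

In a real cluster the map and reduce calls would run on different nodes and the shuffle would move data over the network; here the three steps are only separated to make the phases visible.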

Citation (APA)

Verma, A., Mansuri, A. H., & Jain, N. (2016). Big data management processing with Hadoop MapReduce and spark technology: A comparison. In 2016 Symposium on Colossal Data Analysis and Networking, CDAN 2016. Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CDAN.2016.7570891
