Distributed programming frameworks in cloud platforms

ISSN: 2277-3878

Abstract

Cloud computing has enabled the storage and analysis of large volumes of data, or big data. Alongside cloud computing, a new discipline in computer science known as Data Science has emerged. Data Science is an interdisciplinary field that includes statistics, machine learning, predictive analytics and deep learning, and it is meant for extracting hidden patterns from big data. Because big data requires more storage space than traditional storage devices can accommodate, Infrastructure as a Service (IaaS) cloud resources are used. Big data and big data analytics therefore cannot exist without cloud computing. Big data can also be subjected to analytics to obtain Business Intelligence (BI). This process needs distributed programming frameworks such as Hadoop, Apache Spark, Apache Flink, Apache Storm and Apache Samza. Without a thorough understanding of these frameworks, which run on cloud platforms, it is difficult to use them appropriately. This paper therefore presents a comparative study of these frameworks and an empirical evaluation of Apache Flink and Apache Spark. The TeraSort benchmark is used for the experiments.

Cite

Patil, A. (2019). Distributed programming frameworks in cloud platforms. International Journal of Recent Technology and Engineering, 7(6), 611–619.