An in-memory-based big data analytics with two-level storage on private cloud

Nikkita Shekhar; Ambika Pawar

Conference Proceedings

Advances in Intelligent Systems and Computing (2017), vol. 479, pp. 935-942

DOI: 10.1007/978-981-10-1708-7_109

Abstract

With the growing capacity of main memory, in-memory big data management and processing is maturing and is now used in many big data applications. It supports interactive data analysis by improving I/O throughput. Memory-centric distributed file systems such as Tachyon and in-memory cluster computing frameworks such as Apache Spark are used for analytical problems where both speed and fault tolerance are mandatory. To achieve high-speed big data processing, we propose a system design with a two-tier storage architecture that combines HDFS with the in-memory file system Tachyon. The architecture also uses Apache Spark, an open-source in-memory data processing tool, to analyse the data. In this framework, we utilise main memory by integrating a caching algorithm to reduce data processing time. As the experimental result, we present a comparison between the performance of traditional Hadoop MapReduce and that of this in-memory framework. We also survey existing storage and computation infrastructures, their performance when integrated, and their contribution to solving I/O-intensive analytical problems.
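The two-tier idea in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the abstract does not name the caching algorithm, so least-recently-used (LRU) eviction is assumed here, and the class name `TwoTierStore` and its fields are hypothetical. A plain dictionary stands in for the persistent tier (HDFS in the paper) and a bounded ordered dictionary stands in for the memory tier (Tachyon in the paper).

```python
from collections import OrderedDict


class TwoTierStore:
    """Toy two-level store: a bounded in-memory tier over a slower backing tier.

    Assumption: LRU eviction. The paper's actual caching algorithm is not
    stated in the abstract.
    """

    def __init__(self, backing, capacity):
        self.backing = backing        # stand-in for the persistent tier
        self.capacity = capacity      # max number of blocks kept in memory
        self.memory = OrderedDict()   # stand-in for the in-memory tier
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        # Fast path: the block is already in the memory tier.
        if block_id in self.memory:
            self.memory.move_to_end(block_id)  # mark as most recently used
            self.hits += 1
            return self.memory[block_id]

        # Slow path: fetch from the backing tier and cache the block.
        self.misses += 1
        data = self.backing[block_id]  # raises KeyError for unknown blocks
        self.memory[block_id] = data
        if len(self.memory) > self.capacity:
            self.memory.popitem(last=False)  # evict the least recently used
        return data


store = TwoTierStore({"a": b"1", "b": b"2", "c": b"3"}, capacity=2)
for block in ["a", "b", "a", "c"]:
    store.read(block)
```

Repeated reads of a block that still fits in the memory tier avoid the backing tier entirely, which is the effect the abstract attributes to integrating a cache with main memory.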

Citation (APA)

Shekhar, N., & Pawar, A. (2017). An in-memory-based big data analytics with two-level storage on private cloud. In Advances in Intelligent Systems and Computing (Vol. 479, pp. 935–942). Springer Verlag. https://doi.org/10.1007/978-981-10-1708-7_109
