An in-memory-based big data analytics with two-level storage on private cloud

Abstract

As main memory capacity grows, in-memory big data management and processing is maturing and is being used in many big data applications. It supports interactive data analysis by improving I/O throughput. Memory-centric distributed file systems such as Tachyon and in-memory cluster computing frameworks such as Apache Spark are used for analytical problems where both speed and fault tolerance are mandatory. To achieve high-speed big data processing, we propose a system design built on a two-tier storage architecture that combines HDFS with the in-memory file system Tachyon. The architecture also uses Apache Spark, an open-source in-memory data processing tool, to analyse the big data. Within this framework, we utilise main memory by integrating a caching algorithm to improve data processing time. As the experimental result, we demonstrate a comparison between the performance of traditional Hadoop MapReduce and that of this in-memory framework. In this paper, we also survey the existing storage and computation infrastructures, their performance when integrated, and their contribution to solving many I/O-intensive analytical problems.
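The two-tier idea in the abstract can be illustrated with a small sketch. The abstract does not name the caching algorithm, so least-recently-used (LRU) eviction is assumed here purely for illustration. The class `TwoTierStore`, its methods, and the plain dictionary standing in for the persistent tier are all hypothetical; none of this uses the HDFS, Tachyon, or Spark APIs, and it is not the authors' implementation.

```python
from collections import OrderedDict


class TwoTierStore:
    """Toy read-through cache: a bounded memory tier over a slower backing tier.

    Hypothetical illustration only. The memory tier stands in for an
    in-memory file system and the backing mapping for persistent storage.
    """

    def __init__(self, backing, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.backing = backing        # stand-in for the persistent tier
        self.capacity = capacity      # maximum number of blocks kept in memory
        self.memory = OrderedDict()   # stand-in for the in-memory tier
        self.hits = 0
        self.misses = 0

    def _evict_if_full(self):
        # Assumed policy: drop the least recently used block.
        if len(self.memory) > self.capacity:
            self.memory.popitem(last=False)

    def read(self, key):
        if key in self.memory:
            # Served from memory; mark the block as most recently used.
            self.memory.move_to_end(key)
            self.hits += 1
            return self.memory[key]
        # Miss: fetch from the backing tier (raises KeyError if absent),
        # then keep a copy in memory for later reads.
        self.misses += 1
        value = self.backing[key]
        self.memory[key] = value
        self._evict_if_full()
        return value

    def write(self, key, value):
        # Write-through: persist first, then cache the block in memory.
        self.backing[key] = value
        self.memory[key] = value
        self.memory.move_to_end(key)
        self._evict_if_full()


if __name__ == "__main__":
    store = TwoTierStore({"a": 1, "b": 2, "c": 3}, capacity=2)
    for block in ["a", "b", "a", "c"]:
        store.read(block)
    print(list(store.memory), store.hits, store.misses)
```

Repeated reads of a hot block are served from the memory tier, which is the effect the abstract attributes to caching; the real system's eviction policy and block granularity may differ.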

Shekhar, N., & Pawar, A. (2017). An in-memory-based big data analytics with two-level storage on private cloud. In Advances in Intelligent Systems and Computing (Vol. 479, pp. 935–942). Springer Verlag. https://doi.org/10.1007/978-981-10-1708-7_109
