Grouping-aware data placement in HDFS for data-intensive applications based on graph clustering


Abstract

The time taken to execute a query and return the results increases exponentially as the data size increases, leading to longer waiting times for the user. Hadoop, with its distributed processing capability, can be considered an efficient solution for processing such large data. Hadoop’s default data placement strategy (HDDPS) places data blocks randomly across the cluster nodes without considering any execution parameters. However, most data-intensive applications exhibit grouping semantics: during any query execution, only a part of the big data set is used. Because this grouping behavior is not considered, the default placement performs poorly, leading to increased execution time and query latency. Hence, an optimal data placement strategy based on grouping semantics is proposed. First, the user history log is analyzed to identify the access pattern, which is depicted as an execution graph. The Markov clustering algorithm is then applied to identify the grouping pattern of the data. Finally, an optimal data placement algorithm based on statistical measures is proposed, which reorganizes the default data layouts in HDFS. This increases parallel execution, resulting in improved data locality and reduced query execution time compared to HDDPS. The experimental results support the proposed algorithm and show it to be more efficient for big data sets processed in a heterogeneous distributed environment.
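The grouping step described in the abstract can be illustrated with a minimal sketch of Markov clustering over a block co-access graph. The abstract does not specify how the execution graph is weighted, so this sketch assumes that an edge weight is the number of logged queries reading both blocks. The log contents, block names (`b1` to `b6`), function names, and the inflation parameter are hypothetical choices for illustration, not the authors' implementation. The placement step is omitted because the abstract does not define its statistical measures.

```python
from itertools import combinations


def build_access_graph(query_log):
    """Build a weighted adjacency matrix from a list of per-query block lists.

    Assumption: an edge weight counts the queries that read both blocks.
    """
    blocks = sorted({b for query in query_log for b in query})
    index = {b: i for i, b in enumerate(blocks)}
    n = len(blocks)
    # Self-loops of weight 1.0 are a common MCL convention.
    matrix = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for query in query_log:
        for a, b in combinations(sorted(set(query)), 2):
            matrix[index[a]][index[b]] += 1.0
            matrix[index[b]][index[a]] += 1.0
    return blocks, matrix


def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def expand(m):
    """Expansion step: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation step: raise entries to a power, then renormalize columns."""
    return normalize_columns([[x ** power for x in row] for row in m])


def markov_cluster(matrix, inflation=2.0, max_iter=100, tol=1e-9):
    """Alternate expansion and inflation until the matrix stops changing."""
    m = normalize_columns(matrix)
    n = len(m)
    for _ in range(max_iter):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Each row that keeps non-negligible mass is an attractor; the columns
    # it reaches form one cluster. Identical clusters are deduplicated.
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return clusters


# Hypothetical history log: each entry lists the blocks one query read.
query_log = [
    ["b1", "b2", "b3"],
    ["b1", "b2"],
    ["b4", "b5"],
    ["b4", "b5", "b6"],
]
blocks, matrix = build_access_graph(query_log)
groups = sorted(sorted(blocks[j] for j in c) for c in markov_cluster(matrix))
print(groups)  # [['b1', 'b2', 'b3'], ['b4', 'b5', 'b6']]
```

In this toy log the two sets of blocks are never read together, so the clustering separates them into two groups. A grouping-aware placement could then spread the blocks of each group across different nodes so that a query touching one group runs in parallel.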

Cite

Vengadeswaran, S., & Balasundaram, S. R. (2018). Grouping-aware data placement in HDFS for data-intensive applications based on graph clustering. In Advances in Intelligent Systems and Computing (Vol. 554, pp. 21–31). Springer Verlag. https://doi.org/10.1007/978-981-10-3773-3_3
