File system performance tuning for standard big data benchmarks

Abstract

Modern file systems manage very large data sets to support data-intensive, cost-effective analytical processing. The performance of a file system depends on the storage hardware, software, workload characteristics, and configuration. Analysis often requires complex techniques because the data are a hybrid mix of different formats and differently structured datasets. Performance studies help to optimize these factors and to improve the design of a file system so that it meets the requirements of a specific application. A promising approach is to allocate the diverse data of various applications to different file systems according to their individual properties, so that each application gets the best possible performance. This paper addresses several bases that simulate the characteristics and scenarios of each step of the data execution procedure. Based on workload characteristic analysis, administrators can then apply tuning methods within the large, high-dimensional space of configuration parameters provided by the platform. Preliminary results are provided from running the standard benchmarks TPCx-HS, TPCx-BB, TPC-H, and HiBench K-means on the Ext4 and Btrfs file systems, and the impact of workload characteristics on benchmark performance is analysed.

Cite

Ren, D. Q., & Xia, B. (2018). File system performance tuning for standard big data benchmarks. In ACM International Conference Proceeding Series (Vol. Part F137704, pp. 22–26). Association for Computing Machinery. https://doi.org/10.1145/3219788.3219809
