File system performance tuning for standard big data benchmarks

Da Qi Ren; Bing Xia

Conference Proceedings (Open Access)

ACM International Conference Proceeding Series (2018) Part F137704 22-26

DOI: 10.1145/3219788.3219809

Abstract

Modern file systems manage very large data sets to support data-intensive, cost-effective analytical processing. The performance of a file system depends on the storage, software, workload characteristics, and configuration. Analysis requires complex techniques because the data are often a hybrid mix of different formats and differently structured datasets. Performance studies help to optimize these factors and to improve the design of a file system so that it meets the requirements of a specific application. A promising approach is to place the diverse data of different applications on different file systems according to their individual properties, so that each application gets the best possible performance. This paper addresses some bases for simulating the characteristics and scenarios of each step of the data execution procedure. Based on an analysis of workload characteristics, an administrator can then apply tuning methods across the large, high-dimensional space of configuration parameters that the platform provides. Preliminary results are reported from running the standard benchmarks TPCx-HS, TPCx-BB, TPC-H, and HiBench K-means on the Ext4 and Btrfs file systems, and the impact of workload characteristics on benchmark performance is analysed.

Cite (APA)

Ren, D. Q., & Xia, B. (2018). File system performance tuning for standard big data benchmarks. In ACM International Conference Proceeding Series (Vol. Part F137704, pp. 22–26). Association for Computing Machinery. https://doi.org/10.1145/3219788.3219809
