The implications of diverse applications and scalable data sets in benchmarking big data systems

Zhen Jia; Runlin Zhou; Chunge Zhu; Lei Wang; Wanling Gao; Yingjie Shi; Jianfeng Zhan; Lixin Zhang

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014) 8163 LNCS 44-59

DOI: 10.1007/978-3-642-53974-9_5

Abstract

We live in an era of big data, and big data applications are becoming increasingly pervasive. How to benchmark data center computer systems running big data applications (big data systems, for short) is a hot topic. In this paper, we focus on measuring the performance impact of diverse applications and scalable volumes of data on big data systems. For four typical data analysis applications, an important class of big data applications, we find two major results through experiments. First, data scale has a significant impact on the performance of big data systems, so big data benchmarks must provide scalable volumes of data. Second, even though all four applications use simple algorithms, their performance trends differ as data scale increases; hence, benchmarking big data systems must consider not only a variety of data sets but also a variety of applications. © 2014 Springer-Verlag Berlin Heidelberg.

Cite

Jia, Z., Zhou, R., Zhu, C., Wang, L., Gao, W., Shi, Y., … Zhang, L. (2014). The implications of diverse applications and scalable data sets in benchmarking big data systems. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8163 LNCS, pp. 44–59). Springer Verlag. https://doi.org/10.1007/978-3-642-53974-9_5
