Benchmarking SQL-on-Hadoop systems: TPC or not TPC?

Avrilia Floratou; Fatma Özcan; Berni Schiefer

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 8991 63-72

DOI: 10.1007/978-3-319-20233-4_7

Abstract

Benchmarks are important tools to evaluate systems, as long as their results are transparent, reproducible and they are conducted with due diligence. Today, many SQL-on-Hadoop vendors use the data generators and the queries of existing TPC benchmarks, but fail to adhere to the rules, producing results that are not transparent. As the SQL-on-Hadoop movement continues to gain more traction, it is important to bring some order to this “wild west” of benchmarking. First, new rules and policies should be defined to satisfy the demands of the new generation SQL systems. The new benchmark evaluation schemes should be inexpensive, effective and open enough to embrace the variety of SQL-on-Hadoop systems and their corresponding vendors. Second, adhering to the new standards requires industry commitment and collaboration. In this paper, we discuss the problems we observe in the current practices of benchmarking, and present our proposal for bringing standardization in the SQL-on-Hadoop space.

Cite

Floratou, A., Özcan, F., & Schiefer, B. (2015). Benchmarking SQL-on-Hadoop systems: TPC or not TPC? In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8991, pp. 63–72). Springer Verlag. https://doi.org/10.1007/978-3-319-20233-4_7
