In this paper we present the initial results of our work on running BigBench on Spark. First, we evaluated the scalability of the existing MapReduce implementation of BigBench. Next, we executed the group of 14 pure HiveQL queries on Spark SQL and compared the results with the corresponding Hive results. Our experiments show that (1) for both Hive and Spark SQL, BigBench queries scale better than linearly, on average, as the data size increases, and (2) pure HiveQL queries run faster on Spark SQL than on Hive.
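One way to read the "better than linear" finding is to compare how much a query's runtime grows against how much the data grows. The sketch below is only an illustration of that reading, not the authors' evaluation method: the function name `scaling_ratio` is hypothetical, and the measurements are made-up placeholder numbers, not results from the paper.

```python
def scaling_ratio(base_size, base_time, size, time):
    """Return runtime growth divided by data-size growth.

    A value below 1.0 means the runtime grew more slowly than the data
    (better than linear scaling); 1.0 is exactly linear; above 1.0 is worse.
    """
    if min(base_size, base_time, size, time) <= 0:
        raise ValueError("sizes and times must be positive")
    return (time / base_time) / (size / base_size)


# Hypothetical measurements (NOT taken from the paper):
# data size (e.g. scale factor) -> query runtime in seconds.
runs = {100: 60.0, 300: 150.0, 600: 270.0}

base = min(runs)  # the smallest data size is the baseline
for size in sorted(runs):
    ratio = scaling_ratio(base, runs[base], size, runs[size])
    print(size, round(ratio, 2))
```

With these placeholder numbers, every ratio after the baseline is below 1.0, which is what a better-than-linear trend would look like under this definition.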
Ivanov, T., & Beer, M. G. (2016). Performance evaluation of Spark SQL using BigBench. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10044, pp. 96–116). Springer Verlag. https://doi.org/10.1007/978-3-319-49748-8_6