Challenging SQL-on-Hadoop Performance with Apache Druid

José Correia; Carlos Costa; Maribel Yasmina Santos

Conference Proceedings

Lecture Notes in Business Information Processing (2019) 353 149-161

DOI: 10.1007/978-3-030-20485-3_12

Abstract

In Big Data contexts, SQL-on-Hadoop tools usually provide satisfactory performance when processing vast amounts of data, although emerging tools may offer an alternative. This paper evaluates whether Apache Druid, an innovative column-oriented data store suited to online analytical processing (OLAP) workloads, is an alternative to some of the well-known SQL-on-Hadoop technologies, and assesses its potential in this role. In this evaluation, Druid, Hive, and Presto are benchmarked with increasing data volumes. The results point to Druid as a strong alternative that achieves better performance than Hive and Presto, and they show the potential of integrating Hive and Druid to enhance the capabilities of both tools.

Citation (APA)

Correia, J., Costa, C., & Santos, M. Y. (2019). Challenging SQL-on-Hadoop Performance with Apache Druid. In Lecture Notes in Business Information Processing (Vol. 353, pp. 149–161). Springer Verlag. https://doi.org/10.1007/978-3-030-20485-3_12
