Challenging SQL-on-Hadoop Performance with Apache Druid

Abstract

In Big Data contexts, SQL-on-Hadoop tools usually provide satisfactory performance for processing vast amounts of data, although newly emerging tools may offer an alternative. This paper evaluates whether Apache Druid, an innovative column-oriented data store suited for online analytical processing (OLAP) workloads, is a viable alternative to some of the well-known SQL-on-Hadoop technologies, and assesses its potential in this role. In this evaluation, Druid, Hive, and Presto are benchmarked with increasing data volumes. The results point to Druid as a strong alternative, achieving better performance than Hive and Presto, and show the potential of integrating Hive and Druid, enhancing the capabilities of both tools.

Citation (APA)

Correia, J., Costa, C., & Santos, M. Y. (2019). Challenging SQL-on-Hadoop Performance with Apache Druid. In Lecture Notes in Business Information Processing (Vol. 353, pp. 149–161). Springer Verlag. https://doi.org/10.1007/978-3-030-20485-3_12
