Supporting real-time analytic queries in big and fast data environments

Abstract

Recently there has been significant interest in performing real-time analytical queries in systems that can handle both “big data” and “fast data”. In this paper, we propose an approximate answering approach, called ROSE, which can manage big and fast data streams and support complex analytical queries against them. To achieve this goal, we start with an analysis of existing query processing techniques in big data systems to understand the requirements for building a distributed analytic sketch. We then propose a sampling-based sketch that can extract multi-faceted samples from asynchronous data streams, and we augment its usability with accuracy-lossless distributed sketch construction operations, such as splitting, merging and union. Experimental results on real-world data sets indicate that, compared with the state-of-the-art approximate answering engine BlinkDB, our techniques obtain more accurate estimates and improve system throughput by a factor of 2. Compared with the distributed in-memory computing system Spark, our system achieves an improvement of 2 orders of magnitude in query response time.
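
As an illustration of what an accuracy-lossless merge of sampling-based sketches can look like, here is a minimal Python sketch. It is not ROSE's algorithm, which the abstract does not specify. It uses a generic bottom-k sample instead: each record gets a pseudo-random priority derived from a hash of its key, and the sketch keeps the k records with the smallest priorities. The class name `BottomKSketch`, the helper `priority`, and the assumption that keys are unique across partitions are all hypothetical choices made for this example.

```python
import hashlib
import heapq


def priority(key: str) -> float:
    """Map a key to a deterministic pseudo-random priority in [0, 1)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


class BottomKSketch:
    """Keeps the k records with the smallest hash priorities (illustrative only)."""

    def __init__(self, k: int):
        self.k = k
        # Min-heap over negated priorities, so the root holds the record
        # with the largest priority among those currently kept.
        self._heap = []

    def add(self, key: str, value: float) -> None:
        entry = (-priority(key), key, value)
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, entry)
        elif entry > self._heap[0]:
            # New record has a smaller priority than the current worst: swap it in.
            heapq.heapreplace(self._heap, entry)

    def items(self):
        """Return the sample as (priority, key, value), smallest priority first."""
        return sorted((-p, key, value) for p, key, value in self._heap)

    def merge(self, other: "BottomKSketch") -> "BottomKSketch":
        """Combine two sketches built over disjoint partitions of a stream."""
        merged = BottomKSketch(min(self.k, other.k))
        for _, key, value in self._heap + other._heap:
            merged.add(key, value)
        return merged
```

Because the k smallest priorities of a combined stream are always contained in the union of each partition's k smallest, merging two such sketches of equal size yields exactly the sample that a single sketch over the whole stream would have produced. Under the assumptions above, that is one sense in which a merge can be called lossless.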

Wu, G., Yun, X., Li, C., Wang, S., Wang, Y., Zhang, X., … Zhang, G. (2017). Supporting real-time analytic queries in big and fast data environments. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10178 LNCS, pp. 477–493). Springer Verlag. https://doi.org/10.1007/978-3-319-55699-4_29
