Supporting real-time analytic queries in big and fast data environments

Guangjun Wu; Xiaochun Yun; Chao Li; Shupeng Wang; Yipeng Wang; Xiaoyu Zhang; Siyu Jia; Guangyan Zhang

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2017) 10178 LNCS 477-493

DOI: 10.1007/978-3-319-55699-4_29

Abstract

Recently there has been significant interest in performing real-time analytical queries in systems that can handle both “big data” and “fast data”. In this paper, we propose an approximate answering approach, called ROSE, which can manage big and fast data streams and support complex analytical queries against them. To achieve this goal, we start with an analysis of existing query processing techniques in big data systems to understand the requirements for building a distributed analytic sketch. We then propose a sampling-based sketch that can extract multi-faceted samples from asynchronous data streams, and augment its usability with accuracy-lossless distributed sketch construction operations, such as splitting, merging and union. Experimental results on real-world data sets indicate that, compared with the state-of-the-art approximate answering engine BlinkDB, our techniques obtain more accurate estimates and improve system throughput by a factor of 2. Compared with the distributed in-memory computing system Spark, our system achieves an improvement of 2 orders of magnitude in query response time.
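To make the idea of an accuracy-lossless merge concrete, the following is a minimal sketch of a mergeable sample. It is not the ROSE sketch, whose construction the abstract does not detail; it swaps in a generic bottom-k sample as a stand-in. The names `priority`, `BottomKSample`, `add`, `merge` and `estimate_distinct` are hypothetical and chosen only for illustration.

```python
import hashlib
import heapq


def priority(key: str) -> float:
    """Map a key to a deterministic pseudo-random value in [0, 1)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


class BottomKSample:
    """Illustrative bottom-k sample (an assumption, not the ROSE sketch).

    Keeps the k distinct keys with the smallest hash priorities.
    """

    def __init__(self, k: int):
        self.k = k
        self.items = {}  # key -> priority, at most k entries

    def add(self, key: str) -> None:
        self.items[key] = priority(key)
        if len(self.items) > self.k:
            # Evict the entry with the largest priority.
            worst = max(self.items, key=self.items.get)
            del self.items[worst]

    def merge(self, other: "BottomKSample") -> "BottomKSample":
        """Combine two samples built with the same k.

        Any key among the k smallest priorities of the combined stream is
        also among the k smallest of the stream it came from, so the result
        equals the sample that a single pass over both streams would build.
        """
        merged = BottomKSample(self.k)
        combined = {**self.items, **other.items}
        smallest = heapq.nsmallest(self.k, combined.items(), key=lambda kv: kv[1])
        merged.items = dict(smallest)
        return merged

    def estimate_distinct(self) -> float:
        """Estimate the number of distinct keys seen so far."""
        if len(self.items) < self.k:
            return float(len(self.items))  # exact while the sample is not full
        return (self.k - 1) / max(self.items.values())
```

In this stand-in, two nodes could each maintain a `BottomKSample` over their own stream and ship only the k retained entries to a coordinator, which merges them without losing accuracy relative to a centralized sample.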

Cite

Wu, G., Yun, X., Li, C., Wang, S., Wang, Y., Zhang, X., … Zhang, G. (2017). Supporting real-time analytic queries in big and fast data environments. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10178 LNCS, pp. 477–493). Springer Verlag. https://doi.org/10.1007/978-3-319-55699-4_29
