Grosbeak: A Data Warehouse Supporting Resource-Aware Incremental Computing

Abstract

As the primary approach to deriving decision-support insights, automated recurring routine analytic jobs account for a major part of cluster resource usage in modern enterprise data warehouses. These recurring routine jobs usually have stringent schedules and deadlines determined by external business logic, and thus cause severe resource skew and resource over-provisioning in the cluster. In this paper, we present Grosbeak, a novel data warehouse that supports resource-aware incremental computing to process recurring routine jobs, smooths the resource skew, and optimizes resource usage. Unlike batch processing in traditional data warehouses, Grosbeak leverages the fact that data is continuously ingested: it breaks an analysis job into small batches that incrementally process the progressively available data, and schedules these small-batch jobs intelligently when the cluster has free resources. In this demonstration, we showcase Grosbeak using real-world analysis pipelines. Users can interact with the data warehouse by registering recurring queries and observing the incremental scheduling behavior and the smoothed resource usage pattern.
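
The idea of splitting a recurring job into small batches and running them when the cluster is idle can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not Grosbeak's actual design or API: the names `IncrementalJob`, `schedule`, `free_slots`, and `batch_result` are hypothetical, the job is a simple per-key sum, and "free resources" is modeled as the number of small batches the cluster can absorb at each time step.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class IncrementalJob:
    """Hypothetical recurring aggregation: total amount per key."""
    state: Counter = field(default_factory=Counter)  # partial result kept between runs
    pending: list = field(default_factory=list)      # ingested but unprocessed batches

    def ingest(self, batch):
        """Record a newly arrived batch of (key, amount) rows."""
        self.pending.append(batch)

    def run_small_batch(self):
        """Fold the oldest pending batch into the partial result."""
        for key, amount in self.pending.pop(0):
            self.state[key] += amount


def schedule(job, arrivals, free_slots, deadline):
    """Simulate time steps 0..deadline (assumed scheduling policy).

    arrivals[t]   : list of batches ingested at step t
    free_slots[t] : how many small batches the cluster can absorb at step t
    At the deadline, everything still pending is processed regardless of load.
    Returns the number of small batches processed at each step.
    """
    processed = []
    for t in range(deadline + 1):
        for batch in arrivals.get(t, []):
            job.ingest(batch)
        budget = len(job.pending) if t == deadline else free_slots.get(t, 0)
        n = min(budget, len(job.pending))
        for _ in range(n):
            job.run_small_batch()
        processed.append(n)
    return processed


def batch_result(arrivals):
    """Reference answer: process all data in one batch at the deadline."""
    total = Counter()
    for batches in arrivals.values():
        for batch in batches:
            for key, amount in batch:
                total[key] += amount
    return total
```

With four hourly batches and spare capacity only at steps 0 and 2, most of the work is done before the deadline, and the incremental result matches what a single batch run over the same data would produce. A traditional batch job would instead process all four batches at the deadline step.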

Cite

Wang, Z., Zeng, K., Huang, B., Chen, W., Cui, X., Wang, B., … Zhou, J. (2020). Grosbeak: A Data Warehouse Supporting Resource-Aware Incremental Computing. In Proceedings of the ACM SIGMOD International Conference on Management of Data (pp. 2797–2800). Association for Computing Machinery. https://doi.org/10.1145/3318464.3384708
