Efficient partitioning and allocation of data for workload queries

Abstract

Our aim is to provide efficient partitioning and replication of data. We seek to accommodate a variety of transaction types (short and long-running, read- and write-oriented) to support workloads in cloud environments. We do so by introducing an approach that partitions data into small units, which we call micropartitions, and allocates them to multiple database nodes. Only the data that the workload needs is made available, in the form of micropartitions, and transactions are routed directly to the appropriate micropartitions. First, we use an agglomerative hierarchical clustering technique to group the workload queries by their data requirements. We then represent each cluster with an abstract query definition: a query statement that captures the minimal data requirements that satisfy all the queries belonging to that cluster. A micropartition is realized by executing the abstract query. We show that our abstract query definition is complete and minimal. Intuitively, completeness means that all queries of the corresponding cluster can be answered correctly using the micropartition generated from the abstract query. Minimality means that no smaller partition of the data can satisfy all of the queries in the cluster. Our empirical results show that our approach improves data access efficiency over standard partitioning of data.
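
The clustering and abstract-query steps described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a simplified model in which each query's data requirement is just a set of column names, uses Jaccard distance with average linkage (the abstract does not say which distance or linkage the paper uses), and takes the union of a cluster's column sets as a stand-in for the abstract query definition. The function names, the sample workload, and the 0.7 threshold are all hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets of column names."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_distance(c1, c2, requirements):
    """Average-linkage distance between two clusters of query ids."""
    total = sum(
        jaccard_distance(requirements[q1], requirements[q2])
        for q1 in c1
        for q2 in c2
    )
    return total / (len(c1) * len(c2))


def agglomerate(requirements, threshold):
    """Merge the closest pair of clusters until none is within threshold."""
    # Start bottom-up: every query is its own cluster.
    clusters = [frozenset([q]) for q in sorted(requirements)]
    while len(clusters) > 1:
        (i, j), dist = min(
            (
                ((i, j), cluster_distance(clusters[i], clusters[j], requirements))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if dist > threshold:
            break
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


def abstract_requirement(cluster, requirements):
    """Union of column sets: a simplified stand-in for the abstract query."""
    return frozenset().union(*(requirements[q] for q in cluster))


# Hypothetical workload: query id -> columns the query reads or writes.
requirements = {
    "q1": {"orders.id", "orders.total"},
    "q2": {"orders.id", "orders.total", "orders.date"},
    "q3": {"customers.id", "customers.name"},
    "q4": {"customers.id", "customers.email"},
}

clusters = agglomerate(requirements, threshold=0.7)
micropartitions = {c: abstract_requirement(c, requirements) for c in clusters}

for cluster, columns in micropartitions.items():
    print(sorted(cluster), "->", sorted(columns))
```

Under this simplified model, the union is complete because it contains every column that any query in the cluster needs, and it is minimal because removing any column would leave at least one query in the cluster unanswerable. The paper's abstract query definition presumably also accounts for row-level predicates, which this column-only sketch ignores.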

Kish, A. V., Rose, J. R., & Farkas, C. (2015). Efficient partitioning and allocation of data for workload queries. Lecture Notes in Electrical Engineering, 313, 549–555. https://doi.org/10.1007/978-3-319-06773-5_73
