Efficient partitioning and allocation of data for workload queries

Annamaria V. Kish; John R. Rose; Csilla Farkas

Journal Article

Lecture Notes in Electrical Engineering (2015) 313 549-555

DOI: 10.1007/978-3-319-06773-5_73

Abstract

Our aim is to provide efficient partitioning and replication of data. We seek to accommodate a variety of transaction types (short and long-running, read-oriented and write-oriented) to support workloads in cloud environments. We do so by introducing an approach that partitions data into small units, which we call micropartitions, and allocates them to multiple database nodes. Only the data that the workload needs is made available to it, in the form of micropartitions, and transactions are routed directly to the appropriate micropartitions. First, we use an agglomerative hierarchical clustering technique to group the workload queries by their data requirements. We then represent each cluster with an abstract query definition: a query statement that captures the minimal data requirements satisfying all the queries in that cluster. A micropartition is realized by executing the abstract query. We show that our abstract query definition is complete and minimal. Intuitively, completeness means that all queries of the corresponding cluster can be answered correctly using the micropartition generated from the abstract query. Minimality means that no smaller partition of the data can satisfy all of the queries in the cluster. Our empirical results show that our approach improves data access efficiency over standard partitioning of data.
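The clustering and abstract-query steps described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the distance measure, the linkage rule, or how data requirements are encoded, so this sketch assumes that each query's requirement is a set of column names, that queries are compared by Jaccard distance, that clusters are merged by average linkage up to a hypothetical threshold, and that the abstract query of a cluster is the union of its members' columns. The workload, the query names, and all function names are made up for illustration, and row-level predicates are ignored.

```python
from itertools import combinations

# Hypothetical workload: query name -> set of columns it needs (an assumption).
WORKLOAD = {
    "q1": {"customer.id", "customer.name"},
    "q2": {"customer.id", "customer.city"},
    "q3": {"orders.id", "orders.total"},
    "q4": {"orders.id", "orders.customer_id"},
}


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets of required columns."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def cluster_distance(c1, c2, reqs):
    """Average-linkage distance between two clusters of query names."""
    total = sum(jaccard_distance(reqs[x], reqs[y]) for x in c1 for y in c2)
    return total / (len(c1) * len(c2))


def agglomerate(reqs, threshold):
    """Merge the closest pair of clusters until none is within threshold."""
    clusters = [frozenset([q]) for q in sorted(reqs)]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]], reqs),
        )
        if cluster_distance(clusters[i], clusters[j], reqs) > threshold:
            break
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


def abstract_requirements(cluster, reqs):
    """Column set of the cluster's abstract query: union of member needs."""
    return set().union(*(reqs[q] for q in cluster))


def is_complete(cluster, columns, reqs):
    """Every query in the cluster can be answered from the given columns."""
    return all(reqs[q] <= columns for q in cluster)


def is_minimal(cluster, columns, reqs):
    """Dropping any single column would leave some query unanswerable."""
    return all(not is_complete(cluster, columns - {col}, reqs) for col in columns)


clusters = agglomerate(WORKLOAD, threshold=0.7)
for cluster in clusters:
    columns = abstract_requirements(cluster, WORKLOAD)
    print(sorted(cluster), sorted(columns),
          is_complete(cluster, columns, WORKLOAD),
          is_minimal(cluster, columns, WORKLOAD))
```

Under these assumptions, taking the union of the members' columns makes completeness and minimality hold by construction at the column level. The properties stated in the abstract concern the data returned by the abstract query, which a real implementation would also have to establish for row selection.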

Kish, A. V., Rose, J. R., & Farkas, C. (2015). Efficient partitioning and allocation of data for workload queries. Lecture Notes in Electrical Engineering, 313, 549–555. https://doi.org/10.1007/978-3-319-06773-5_73
