ASAWA: An automatic partition key selection strategy

Xiaoyan Wang; Jinchuan Chen; Xiaoyong Du

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013) 7808 LNCS 609-620

DOI: 10.1007/978-3-642-37401-2_59

Citations: 2

Readers: 1

Abstract

As data volumes grow rapidly, more and more applications have to run in a distributed environment. To obtain high performance, the whole dataset must be carefully divided into multiple partitions that are placed on distributed data nodes. In this process, the choice of partition key greatly affects overall performance. Nevertheless, few works address this topic. Most previous projects on data partitioning either use a simple strategy or rely on a commercial database system to choose partition keys. In this work, we present an automatic partition key selection strategy called ASAWA. It chooses partition keys by analyzing both the dataset and workload schemas. In this way, intimate tuples, i.e. tuples that frequently co-appear in queries, are likely to be placed in the same partition. Hence cross-node joins can be greatly reduced and system performance improved. We conduct a series of experiments over the TPC-H datasets to illustrate the effectiveness of the ASAWA strategy. © 2013 Springer-Verlag.
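The abstract does not spell out how ASAWA scores candidate keys, so the following is only a minimal sketch of the general idea of workload-aware key selection, not the authors' algorithm. It assumes a simplified heuristic: for each table, pick the column that appears most often in the workload's join predicates, so that tuples joined together tend to hash to the same node. The function name `pick_partition_keys`, the input layout, and the query frequencies are hypothetical; only the TPC-H column names are real.

```python
from collections import Counter, defaultdict


def pick_partition_keys(table_columns, workload_joins):
    """Pick, per table, the column used most often in join predicates.

    table_columns: dict mapping table name -> list of column names.
    workload_joins: list of ((table, column), (table, column), frequency).
    This is an illustrative heuristic, not the ASAWA strategy itself.
    """
    usage = defaultdict(Counter)
    for (left_table, left_col), (right_table, right_col), freq in workload_joins:
        # Both sides of an equi-join benefit from being partitioned on the join column.
        usage[left_table][left_col] += freq
        usage[right_table][right_col] += freq

    keys = {}
    for table, columns in table_columns.items():
        if usage[table]:
            keys[table] = usage[table].most_common(1)[0][0]
        else:
            # Assumption: fall back to the first listed column when no join uses the table.
            keys[table] = columns[0]
    return keys


# Hypothetical workload over a few TPC-H tables; the frequencies are made up.
tables = {
    "lineitem": ["l_orderkey", "l_partkey"],
    "orders": ["o_orderkey", "o_custkey"],
    "customer": ["c_custkey"],
    "part": ["p_partkey"],
}
joins = [
    (("lineitem", "l_orderkey"), ("orders", "o_orderkey"), 10),
    (("orders", "o_custkey"), ("customer", "c_custkey"), 3),
    (("lineitem", "l_partkey"), ("part", "p_partkey"), 2),
]
keys = pick_partition_keys(tables, joins)
print(keys)
```

With this made-up workload, `lineitem` and `orders` are both partitioned on their order key, so their most frequent join can be evaluated without moving data between nodes; the less frequent `orders`-`customer` join would still cross nodes.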

Citation (APA)

Wang, X., Chen, J., & Du, X. (2013). ASAWA: An automatic partition key selection strategy. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7808 LNCS, pp. 609–620). https://doi.org/10.1007/978-3-642-37401-2_59
