Scalable Learning to Troubleshoot Query Performance Problems

Alexandar Mihaylov; Vincent Corvinelli; Parke Godfrey; Piotr Mierzejewski; Jaroslaw Szlichta; Calisto Zuzarte

Conference Proceedings, Open Access

International Conference on Information and Knowledge Management, Proceedings (2021) 4016-4025

DOI: 10.1145/3459637.3481947

Abstract

Query optimization has long been fundamental for database systems. There are cracks in the edifice, however, as the complexity of modern query workloads outpaces what database systems can manage well. Automatic tools are needed for database vendors, such as IBM with Db2, to help customers troubleshoot their performance problems, as manual troubleshooting is painstaking. To manage complex and large workloads, we develop a distributed system called dGALO that learns recurring problem patterns in query plans over workloads. dGALO employs these problem patterns to build an RDF-based, SPARQL-queried knowledge base of plan-rewrite remedies. We illustrate a distributed implementation of dGALO on Apache Spark with efficient partitioning strategies for load balancing. The system employs additional pruning strategies via clustering, which yields a fine-grained trade-off between runtime and accuracy. dGALO uses its knowledge base to re-optimize queries, often to dramatic effect, and is a valuable tool for the development team to refine the optimizer with new techniques. We demonstrate the efficiency and effectiveness of our techniques through an experimental study over the TPC-DS benchmark.

Cite

Mihaylov, A., Corvinelli, V., Godfrey, P., Mierzejewski, P., Szlichta, J., & Zuzarte, C. (2021). Scalable Learning to Troubleshoot Query Performance Problems. In International Conference on Information and Knowledge Management, Proceedings (pp. 4016–4025). Association for Computing Machinery. https://doi.org/10.1145/3459637.3481947
