Inclusion dependency discovery: An experimental evaluation of thirteen algorithms

Falco Dürsch; Axel Stebner; Fabian Windheuser; Maxi Fischer; Tim Friedrich; Nils Strelow; Tobias Bleifuß; Hazar Harmouch; Lan Jiang; Thorsten Papenbrock; Felix Naumann

Conference Proceedings, Open Access

International Conference on Information and Knowledge Management, Proceedings (2019) 219-228

DOI: 10.1145/3357384.3357916

Abstract

Inclusion dependencies are an important type of metadata in relational databases, because they indicate foreign key relationships and serve a variety of data management tasks, such as data linkage, query optimization, and data integration. The discovery of inclusion dependencies is, therefore, a well-studied problem and has been addressed by many algorithms. Each of these discovery algorithms follows its own strategy with certain strengths and weaknesses, which makes it difficult for data scientists to choose the optimal algorithm for a given profiling task. This paper summarizes the different state-of-the-art discovery approaches and discusses their commonalities. For evaluation purposes, we carefully re-implemented the thirteen most popular discovery algorithms and discuss their individual properties. Our extensive evaluation on several real-world and synthetic datasets shows the unbiased performance of the different discovery approaches and, hence, provides a guideline on when and where each approach works best. Comparing the different runtimes and scalability graphs, we identify the best approaches for certain situations and demonstrate where certain algorithms fail.

Cite

Dürsch, F., Stebner, A., Windheuser, F., Fischer, M., Friedrich, T., Strelow, N., … Naumann, F. (2019). Inclusion dependency discovery: An experimental evaluation of thirteen algorithms. In International Conference on Information and Knowledge Management, Proceedings (pp. 219–228). Association for Computing Machinery. https://doi.org/10.1145/3357384.3357916
