Dependency discovery in data quality

Daniele Barone; Fabio Stella; Carlo Batini

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2010) 6051 LNCS 53-67

DOI: 10.1007/978-3-642-13094-6_6

18 Citations

30 Readers

Abstract

A conceptual framework for the automatic discovery of dependencies between data quality dimensions is described. Dependency discovery consists of recovering the dependency structure of a set of data quality dimensions measured on the attributes of a database. This task is accomplished with a data mining methodology, by learning a Bayesian Network from a database. The Bayesian Network is used to analyze dependencies between data quality dimensions associated with different attributes. The proposed framework is instantiated on a real-world database. Dependency discovery is presented for the case in which the following data quality dimensions are considered: accuracy, completeness, and consistency. The Bayesian Network model shows how data quality can be improved while satisfying budget constraints. © Springer-Verlag Berlin Heidelberg 2010.

Cite

Barone, D., Stella, F., & Batini, C. (2010). Dependency discovery in data quality. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6051 LNCS, pp. 53–67). https://doi.org/10.1007/978-3-642-13094-6_6