A data cleaning framework based on user feedback

Abstract

In this paper, we present the design of a data cleaning framework that combines data quality rules (CFDs, CINDs, and MDs) with user feedback through an interactive process. First, to generate candidate repairs for each potentially dirty attribute, we propose an optimization model based on a genetic algorithm. We then build a Bayesian machine learning model with several committees to predict the correctness of each repair, and rank the repairs by uncertainty score to improve the learned model. User feedback, given while inspecting the suggestions, is used to decide whether the model is accurate. Finally, our experiments on real-world datasets show a significant improvement in data quality. © 2013 Springer-Verlag Berlin Heidelberg.
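
The ranking step can be illustrated with a minimal sketch. The abstract does not say how the uncertainty score is computed, so this example assumes vote entropy, a common measure of committee disagreement. The repair records, the three toy committee members, and the function names are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Shannon entropy (in bits) of a committee's votes; 0 means unanimous."""
    total = len(votes)
    counts = Counter(votes)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def rank_by_uncertainty(repairs, committee):
    """Sort candidate repairs from most to least uncertain.

    Each committee member is a callable that takes a repair and returns
    True (repair looks correct) or False (repair looks wrong).
    """
    scored = [(vote_entropy([member(r) for member in committee]), r)
              for r in repairs]
    # Stable sort: repairs with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in scored]


# Hypothetical candidate repairs (attribute, dirty value, proposed value).
repairs = [
    {"attr": "zip", "old": "1001", "new": "10001"},
    {"attr": "city", "old": "NYC", "new": "New York"},
    {"attr": "state", "old": "NY", "new": "NJ"},
]

# Toy stand-ins for learned classifiers (assumptions for illustration only).
committee = [
    lambda r: len(r["new"]) > len(r["old"]),   # accepts longer values
    lambda r: r["attr"] != "state",            # distrusts state changes
    lambda r: r["old"][0] == r["new"][0],      # accepts same first character
]

ranked = rank_by_uncertainty(repairs, committee)
for r in ranked:
    print(r["attr"], r["old"], "->", r["new"])
```

Under these assumptions, the repairs the committee disagrees on most are shown to the user first, so each piece of feedback is spent where the model is least sure.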

Cite

Xie, H., Wang, H., Li, J., & Gao, H. (2013). A data cleaning framework based on user feedback. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7923 LNCS, pp. 514–520). Springer Verlag. https://doi.org/10.1007/978-3-642-38562-9_52