BUNNI: Learning Repair Actions in Rule-driven Data Cleaning

Abstract

In this work, we address the open and challenging problem of involving non-expert users as first-class citizens in data repair. Although many proposals address data cleaning from the point of view of expert users (IT staff and data scientists), studies from the perspective of non-expert users are lacking. Given a set of available data quality rules, we exploit machine learning techniques to guide the user in identifying the dirty values in each violation and repairing them. We show that, with little user effort, it is possible to identify which values in a tuple can be trusted and which are most likely errors. We show experimentally that this machine learning approach leads to a unique, high-quality clean solution in scenarios where other approaches fail.

Cite

Mecca, G., Papotti, P., Santoro, D., & Veltri, E. (2024). BUNNI: Learning Repair Actions in Rule-driven Data Cleaning. Journal of Data and Information Quality, 16(2). https://doi.org/10.1145/3665930
