Incremental data fusion based on provenance information

Abstract

Data fusion is the process of combining multiple representations of the same object, extracted from several external sources, into a single, clean representation. It is usually the last step of an integration process, executed after the schema matching and entity identification steps. More specifically, data fusion aims to resolve attribute value conflicts based on user-defined rules. Although several approaches for fusing data exist in the literature, few of them focus on optimizing the process when new versions of the sources become available. In this paper, we propose a model for incremental data fusion. Our approach is based on storing provenance information in the form of a sequence of operations. These operations reflect the last fusion rules applied to the imported data. By keeping both the original source value and the new fused value in the operations repository, we are able to reliably detect source value updates and propagate them to the fusion process, which reapplies previously defined rules whenever possible. This approach reduces the number of data items affected by source updates and minimizes the amount of manual user intervention in future fusion processes. © 2013 Springer-Verlag Berlin Heidelberg.
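The mechanism described in the abstract can be sketched in a few lines. The following is a minimal, hypothetical Python illustration (the class and function names are ours, not from the paper): an operations repository records, for each fused attribute, the original source values, the rule applied, and the fused result; when a new source version arrives, stored source values are compared against the new ones, and the previously defined rule is reapplied automatically, so the user is consulted only for items with no prior provenance.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A fusion rule maps the conflicting source values of one attribute
# to a single fused value (e.g. "prefer the largest number").
FusionRule = Callable[[List[str]], str]

@dataclass
class FusionOp:
    """Provenance record: which source values were fused, by which rule, into what."""
    key: str                  # entity/attribute identifier
    source_values: List[str]  # original source values at fusion time
    rule: FusionRule          # user-defined rule that was applied
    fused_value: str          # result of the fusion

class OperationsRepository:
    def __init__(self) -> None:
        self.ops: Dict[str, FusionOp] = {}

    def fuse(self, key: str, values: List[str], rule: FusionRule) -> str:
        """Apply a rule and store the operation as provenance."""
        fused = rule(values)
        self.ops[key] = FusionOp(key, list(values), rule, fused)
        return fused

    def refresh(self, key: str, new_values: List[str]) -> Tuple[str, bool]:
        """On a new source version, detect updates and reapply the stored rule.

        Returns (fused_value, needs_user); needs_user is True only when
        no previous fusion operation exists for this key.
        """
        op = self.ops.get(key)
        if op is None:
            return ("", True)               # no provenance: requires user input
        if new_values == op.source_values:
            return (op.fused_value, False)  # sources unchanged: keep old result
        # Source values changed: reapply the previously defined rule.
        return (self.fuse(key, new_values, op.rule), False)

# Usage: fuse conflicting population values, then propagate a source update.
repo = OperationsRepository()
prefer_max = lambda vs: max(vs, key=int)
repo.fuse("city42.population", ["120000", "118500"], prefer_max)
value, needs_user = repo.refresh("city42.population", ["121000", "118500"])
print(value, needs_user)  # → 121000 False
```

This mirrors the incremental behavior the paper claims: only items whose stored source values differ from the new version trigger re-fusion, and even those are resolved without user intervention when a rule is already on record.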

Citation (APA)

Hara, C. S., De Aguiar Ciferri, C. D., & Ciferri, R. R. (2013). Incremental data fusion based on provenance information. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8000, 339–365. https://doi.org/10.1007/978-3-642-41660-6_18