In this paper, we study the problem of assessing the quality of co-reference tuples extracted from multiple low-quality data sources and finding true values from them. This problem is a critical part of any effective data integration solution. To solve it, we first propose a model that specifies tuple quality. We then present a framework that infers tuple quality based on the concept of quality predicates. In particular, we propose an algorithm underlying the framework that finds the true value of each attribute. Finally, we conduct extensive experiments on real-life data to verify the effectiveness and efficiency of our methods.
CITATION STYLE
Xie, Z., Liu, Q., & Bao, Z. (2017). Sifting truths from multiple low-quality data sources. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10366 LNCS, pp. 74–81). Springer Verlag. https://doi.org/10.1007/978-3-319-63579-8_7