Exploiting relationships for domain-independent data cleaning

Abstract

In this paper we address the problem of reference disambiguation. Specifically, we consider a situation where entities in the database are referred to using descriptions (e.g., a set of instantiated attributes). The objective of reference disambiguation is to identify the unique entity to which each description corresponds. The key difference between the approach we propose (called RelDC) and traditional techniques is that RelDC analyzes not only object features but also inter-object relationships to improve disambiguation quality. Our extensive experiments over two real datasets and over synthetic datasets show that analyzing relationships significantly improves the quality of the result. Copyright © by SIAM.
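To make the idea in the abstract concrete, the following is a minimal toy sketch of combining a feature-based score with a relationship-based score when choosing among candidate entities. It is not the RelDC algorithm from the paper. The function names, the shared-neighbor measure, the `weight` parameter, and the example data are all assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def feature_similarity(description, entity_name):
    """Feature-based signal: string similarity in [0, 1] (illustrative choice)."""
    return SequenceMatcher(None, description.lower(), entity_name.lower()).ratio()


def connection_strength(graph, a, b):
    """Relationship-based signal: number of neighbors shared by a and b.

    This is a crude stand-in, assumed here for illustration; it is not the
    measure used in the paper.
    """
    return len(graph.get(a, set()) & graph.get(b, set()))


def disambiguate(description, context_entity, candidates, graph, names, weight=0.5):
    """Pick the candidate with the best combined feature + relationship score."""
    def score(candidate):
        return (feature_similarity(description, names[candidate])
                + weight * connection_strength(graph, context_entity, candidate))
    return max(candidates, key=score)


# Hypothetical data: two authors who both match the description "J. Smith".
names = {"john_smith": "John Smith", "jane_smith": "Jane Smith"}

# Hypothetical entity graph: each node maps to the set of papers it is linked to.
graph = {
    "alice": {"p1", "p2"},
    "john_smith": {"p1", "p2"},
    "jane_smith": {"p9"},
}

# The reference "J. Smith" appears in the context of the known entity "alice".
best = disambiguate("J. Smith", "alice", ["john_smith", "jane_smith"], graph, names)
print(best)
```

In this toy setup the description alone barely separates the two candidates, while the shared papers with the context entity favor one of them, which is the kind of signal the abstract attributes to inter-object relationships.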

Cite

Kalashnikov, D. V., Mehrotra, S., & Chen, Z. (2005). Exploiting relationships for domain-independent data cleaning. In Proceedings of the 2005 SIAM International Conference on Data Mining, SDM 2005 (pp. 262–273). Society for Industrial and Applied Mathematics Publications. https://doi.org/10.1137/1.9781611972757.24
