Dynamic similarity-aware inverted indexing for real-time entity resolution


Abstract

Entity resolution is the process of identifying groups of records in one or more data sources that represent the same real-world entity. It is an important tool in data de-duplication, in linking records across databases, and in matching query records against a database of existing entities. Most existing entity resolution techniques complete the resolution process offline and on static databases. However, real-world databases are often dynamic, and organizations increasingly need to resolve entities in real time. Thus, there is a need for new techniques that facilitate working with dynamic databases in real time. In this paper, we propose a dynamic similarity-aware inverted indexing technique (DySimII) that meets these requirements. We also propose a frequency-filtered indexing technique in which only the most frequent attribute values are indexed. We experimentally evaluate our techniques on a large real-world voter database. The results show that, as the index grows, neither the average record insertion time (around 0.1 ms) nor the average query time (less than 0.1 s) increases appreciably. We also find that applying the frequency-filtered approach reduces the index size with only a slight drop in recall. © Springer-Verlag 2013.
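
The abstract does not give the data structures behind DySimII, so the following is only a minimal sketch of the general idea of a dynamic, similarity-aware inverted index, not the authors' implementation. It assumes a single string attribute, a hypothetical blocking key (the first two characters of the value), and Python's difflib.SequenceMatcher as a stand-in similarity function. All names (SimilarityAwareIndex, block_key, top_values, allowed) are illustrative. Similarities between attribute values are computed once, when a new value is first inserted, so a query for an already indexed value only needs lookups. The optional allowed set is one assumed way to mimic frequency filtering.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher


def block_key(value):
    """Hypothetical blocking key (an assumption): first two characters."""
    return value[:2]


def similarity(a, b):
    """Stand-in string similarity from the standard library (an assumption)."""
    return SequenceMatcher(None, a, b).ratio()


def top_values(values, n):
    """Return the n most frequent values, e.g. to build a frequency filter."""
    return {v for v, _ in Counter(values).most_common(n)}


class SimilarityAwareIndex:
    def __init__(self, allowed=None):
        self.allowed = allowed            # optional set of values to index
        self.records = defaultdict(set)   # value -> ids of records holding it
        self.blocks = defaultdict(set)    # block key -> values in that block
        self.sims = defaultdict(dict)     # value -> {other value: similarity}

    def insert(self, rec_id, value):
        """Add one record; returns False if the value is filtered out."""
        value = value.lower()
        if self.allowed is not None and value not in self.allowed:
            return False
        if value not in self.records:
            # First time this value is seen: compute its similarity to the
            # values already in the same block, and store it both ways.
            block = self.blocks[block_key(value)]
            for other in block:
                s = similarity(value, other)
                self.sims[value][other] = s
                self.sims[other][value] = s
            block.add(value)
        self.records[value].add(rec_id)
        return True

    def query(self, value, threshold=0.7):
        """Return (record id, score) pairs, best score first."""
        value = value.lower()
        if value in self.records:
            candidates = self.sims.get(value, {})   # precomputed at insert
        else:
            block = self.blocks.get(block_key(value), ())
            candidates = {o: similarity(value, o) for o in block}
        scores = {rid: 1.0 for rid in self.records.get(value, ())}
        for other, s in candidates.items():
            if s >= threshold:
                for rid in self.records[other]:
                    scores[rid] = max(scores.get(rid, 0.0), s)
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


index = SimilarityAwareIndex()
index.insert(1, "smith")
index.insert(2, "smyth")
index.insert(3, "jones")
matches = index.query("smith")   # record 1 exactly, record 2 by similarity
```

In this sketch an insertion compares a new value only against values in its own block, and repeated values cost a single set update. That is one plausible reason insertion time could stay flat as an index grows, though the paper's own design may differ.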

Ramadan, B., Christen, P., Liang, H., Gayler, R. W., & Hawking, D. (2013). Dynamic similarity-aware inverted indexing for real-time entity resolution. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7867 LNAI, pp. 47–58). https://doi.org/10.1007/978-3-642-40319-4_5
