Entity disambiguation with textual and connection information

Abstract

Entity disambiguation is the task of resolving which underlying entity each occurrence of a shared surface form in the data refers to. It arises in information integration, document retrieval, web search, and many other applications. Because entities in most real-world data carry both textual information and inter-object relationships, we propose an unsupervised iterative similarity propagation algorithm to disambiguate entities. We first choose entity pairs with the same surface form as probable matching candidates and construct a connection graph that takes these probable matching pairs as nodes, with edges built from the inter-object relationships. The more similar the textual information of the two records in a probable pair, the more likely the two records correspond to the same real-world entity, so we use the textual similarity score as the initial value for our iterative method. The similarity of each entity pair is then propagated over the connection graph. When the iteration terminates, we identify the pairs whose final similarity scores exceed a given threshold as real matches. The new method is applied to disambiguate authors in publication records. Experimental results on the real DBLP digital library data set demonstrate its effectiveness. © 2012 Published by Elsevier Ltd.
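
The pipeline described in the abstract (candidate pairs, connection graph, textual initialization, propagation, thresholding) can be illustrated with a minimal Python sketch. The abstract does not give the similarity measure, the edge rule, or the update formula, so everything specific here is an assumption made for illustration: the word-set Jaccard score in `text_similarity`, the shared-paper edge rule in `build_graph`, the damped neighbour-averaging update with weight `alpha` in `propagate`, the record fields `name`, `paper`, and `text`, and the threshold of 0.6. It is not the authors' implementation.

```python
from collections import defaultdict
from itertools import combinations


def text_similarity(a, b):
    """Jaccard overlap of word sets (assumed stand-in for the paper's measure)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def candidate_pairs(records):
    """Pair every two records that share a surface form (the 'name' field)."""
    by_name = defaultdict(list)
    for rec_id in sorted(records):
        by_name[records[rec_id]["name"]].append(rec_id)
    return [p for ids in by_name.values() for p in combinations(ids, 2)]


def build_graph(pairs, records):
    """Link pairs (a, b) and (c, d) when a and c share a paper and b and d
    share a paper, in either orientation. This edge rule is an assumption."""
    paper = {r: records[r]["paper"] for r in records}
    graph = {p: set() for p in pairs}
    for p, q in combinations(pairs, 2):
        (a, b), (c, d) = p, q
        aligned = paper[a] == paper[c] and paper[b] == paper[d]
        crossed = paper[a] == paper[d] and paper[b] == paper[c]
        if aligned or crossed:
            graph[p].add(q)
            graph[q].add(p)
    return graph


def propagate(initial, graph, alpha=0.5, tol=1e-6, max_iter=100):
    """Blend each pair's textual score with the mean score of its neighbours
    until the largest change falls below tol (assumed update rule)."""
    scores = dict(initial)
    for _ in range(max_iter):
        updated = {}
        for p, s0 in initial.items():
            nbrs = graph[p]
            if nbrs:
                mean = sum(scores[q] for q in nbrs) / len(nbrs)
                updated[p] = (1 - alpha) * s0 + alpha * mean
            else:
                updated[p] = s0  # isolated pairs keep their textual score
        delta = max((abs(updated[p] - scores[p]) for p in updated), default=0.0)
        scores = updated
        if delta < tol:
            break
    return scores


def disambiguate(records, threshold=0.6):
    """Return the pairs whose final score exceeds the threshold, plus all scores."""
    pairs = candidate_pairs(records)
    graph = build_graph(pairs, records)
    initial = {
        (a, b): text_similarity(records[a]["text"], records[b]["text"])
        for a, b in pairs
    }
    scores = propagate(initial, graph)
    return {p for p, s in scores.items() if s > threshold}, scores


# Hypothetical author references: one record per (author name, paper).
records = {
    "r1": {"name": "J. Wu", "paper": "p1", "text": "data mining optimization"},
    "r2": {"name": "J. Wu", "paper": "p2", "text": "data mining classification"},
    "r3": {"name": "L. Niu", "paper": "p1", "text": "support vector machines"},
    "r4": {"name": "L. Niu", "paper": "p2", "text": "support vector machines"},
    "r5": {"name": "J. Wu", "paper": "p3", "text": "protein folding"},
}
matches, scores = disambiguate(records)
print(sorted(matches))
```

In this toy data the pair (`r1`, `r2`) starts at a textual score of 0.5, below the threshold, and is lifted above it only because its neighbour (`r3`, `r4`), the coauthor pair on the same two papers, has identical text. That is the intended effect of combining textual and connection information. With `alpha` below 1, the assumed update is a contraction, so the iteration settles regardless of graph shape.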

Cite

Niu, L., Wu, J., & Shi, Y. (2012). Entity disambiguation with textual and connection information. In Procedia Computer Science (Vol. 9, pp. 1249–1255). Elsevier B.V. https://doi.org/10.1016/j.procs.2012.04.136
