Entity resolution in texts using statistical learning and ontologies

Tadej Štajner; Dunja Mladenić

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2009) 5926 LNCS 91-104

DOI: 10.1007/978-3-642-10871-6_7

11 Citations

13 Readers

Abstract

Ambiguity, which is inherent in natural language, makes it challenging to determine the actual identities of entities mentioned in a document (e.g., Paris can refer to a city in France, to a small city in Texas, USA, or to Paris, Texas, a 1984 film directed by Wim Wenders). Entity resolution methods address this disambiguation problem. This paper studies various methods for estimating relatedness between entities, as used in collective entity resolution. We define a unified entity resolution approach capable of using implicit as well as explicit relatedness to collectively identify in-text entities. As a relatedness measure, we propose a method that expresses relatedness using the heterogeneous relations of a domain ontology. We also experiment with other relatedness measures, such as statistical learning of co-occurrences of two entities or content similarity between them. Evaluation on real data shows that the new methods for relatedness estimation give good results. © 2009 Springer-Verlag Berlin Heidelberg.
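To illustrate the general idea of collective entity resolution with a pluggable relatedness measure, here is a minimal sketch in Python. It is not the authors' method: the entity identifiers, the toy descriptions, the `resolve` and `content_relatedness` functions, and the exhaustive search are all assumptions made for illustration. Content similarity (cosine over bag-of-words descriptions) stands in for the relatedness measure; a co-occurrence or ontology-based measure could be passed in its place.

```python
from itertools import product
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def resolve(mentions, candidates, relatedness):
    """Choose one candidate per mention, maximizing summed pairwise relatedness.

    Exhaustive search over all joint assignments (an assumption of this
    sketch); it is only practical for a handful of mentions.
    """
    best, best_score = None, float("-inf")
    for assignment in product(*(candidates[m] for m in mentions)):
        score = sum(
            relatedness(assignment[i], assignment[j])
            for i in range(len(assignment))
            for j in range(i + 1, len(assignment))
        )
        if score > best_score:
            best, best_score = assignment, score
    return dict(zip(mentions, best))


# Hypothetical toy data: entity ids and their description terms.
DESCRIPTIONS = {
    "Paris_France": {"france": 1, "city": 1, "seine": 1, "europe": 1},
    "Paris_Texas": {"texas": 1, "usa": 1, "city": 1, "lamar": 1},
    "Dallas_Texas": {"texas": 1, "usa": 1, "city": 1, "metroplex": 1},
    "Dallas_TV": {"television": 1, "series": 1, "drama": 1},
}
CANDIDATES = {
    "Paris": ["Paris_France", "Paris_Texas"],
    "Dallas": ["Dallas_Texas", "Dallas_TV"],
}


def content_relatedness(e1, e2):
    """Relatedness as content similarity between entity descriptions."""
    return cosine(DESCRIPTIONS[e1], DESCRIPTIONS[e2])


result = resolve(["Paris", "Dallas"], CANDIDATES, content_relatedness)
print(result)
```

In this toy setting, the mention "Paris" is resolved jointly with "Dallas": the pair of Texas entities shares the most description terms, so the joint assignment favors them over the individually more prominent readings.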

Cite

Štajner, T., & Mladenić, D. (2009). Entity resolution in texts using statistical learning and ontologies. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5926 LNCS, pp. 91–104). https://doi.org/10.1007/978-3-642-10871-6_7
