Ambiguities, which are inherently present in natural languages represent a challenge of determining the actual identities of entities mentioned in a document (e.g., Paris can refer to a city in France but it can also refer to a small city in Texas, USA or to a 1984 film directed by Wim Wenders having title Paris, Texas). Disambiguation is a problem that can be successfully solved by entity resolution methods. This paper studies various methods for estimating relatedness between entities, used in collective entity resolution. We define a unified entity resolution approach, capable of using implicit as well as explicit relatedness for collectively identifying in-text entities. As a relatedness measure, we propose a method, which expresses relatedness using the heterogeneous relations of a domain ontology. We also experiment with other relatedness measures, such as using statistical learning of co-occurrences of two entities or using content similarity between them. Evaluation on real data shows that the new methods for relatedness estimation give good results. © 2009 Springer-Verlag Berlin Heidelberg.
CITATION STYLE
Štajner, T., & Mladenić, D. (2009). Entity resolution in texts using statistical learning and ontologies. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5926 LNCS, pp. 91–104). https://doi.org/10.1007/978-3-642-10871-6_7
Mendeley helps you to discover research relevant for your work.