Abstract
The task of identifying synonymous relations and objects, or synonym resolution, is critical for high-quality information extraction. This paper investigates synonym resolution in the context of unsupervised information extraction, where neither hand-tagged training examples nor domain knowledge is available. The paper presents a scalable, fully implemented system that runs in O(KN log N) time in the number of extractions, N, and the maximum number of synonyms per word, K. The system, called Resolver, introduces a probabilistic relational model for predicting whether two strings are co-referential based on the similarity of the assertions containing them. On a set of two million assertions extracted from the Web, Resolver resolves objects with 78% precision and 68% recall, and resolves relations with 90% precision and 35% recall. Several variations of Resolver's probabilistic model are explored, and experiments demonstrate that under appropriate conditions these variations can improve F1 by 5%. An extension to the basic Resolver system allows it to handle polysemous names with 97% precision and 95% recall on a data set from the TREC corpus. ©2009 AI Access Foundation. All rights reserved.
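The abstract reports precision and recall separately but states its improvement in terms of F1. As a reading aid, the sketch below applies the standard F1 definition (the harmonic mean of precision and recall) to the three reported pairs. The function name `f1_score` and the dictionary `reported` are illustrative assumptions, and the resulting values are derived here from the abstract's figures; they are not numbers quoted from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs as reported in the abstract.
# The task labels are shorthand chosen for this sketch.
reported = {
    "objects": (0.78, 0.68),
    "relations": (0.90, 0.35),
    "polysemous names (TREC)": (0.97, 0.95),
}

for task, (p, r) in reported.items():
    print(f"{task}: F1 = {f1_score(p, r):.3f}")
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two inputs, which is why the relation result, with high precision but low recall, yields a much lower F1 than its precision alone would suggest.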
Cite
Yates, A., & Etzioni, O. (2009). Unsupervised methods for determining object and relation synonyms on the web. Journal of Artificial Intelligence Research, 34, 255–296. https://doi.org/10.1613/jair.2772