In this paper, we examine the task of extracting a set of biographic facts about target individuals from a collection of Web pages. We automatically annotate training text with positive and negative examples of fact extractions and train Rote, Naïve Bayes, and Conditional Random Field extraction models for fact extraction from individual Web pages. We then propose and evaluate methods for fusing the extracted information across documents to return a consensus answer. A novel cross-field bootstrapping method leverages data interdependencies to yield improved performance. © 2005 Association for Computational Linguistics.
Mann, G. S., & Yarowsky, D. (2005). Multi-field information extraction and cross-document fusion. In ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (pp. 483–490). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1219840.1219900