Conditional Random Fields (CRFs) have been widely used for information extraction from free text as well as from semi-structured documents. Entities of interest in semi-structured domains are often consistently structured within a certain context or document. However, their actual compositions vary and may be inconsistent across different contexts. We present two collective information extraction approaches based on CRFs that exploit these context-specific consistencies. The first approach extends linear-chain CRFs with additional factors specified by a classifier, which learns such consistencies during inference. In the second approach, we propose a variant of skip-chain CRFs, which enables the model to transfer long-range evidence about the consistency of the entities. The practical relevance of the presented work for real-world information extraction systems is highlighted in an empirical study. Both approaches achieve a considerable error reduction. © 2012 Springer-Verlag.
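For orientation, the two model families named in the abstract can be written in their standard textbook form. This is a sketch of the general linear-chain and skip-chain CRF definitions as commonly given in the literature, not the specific variant proposed in the paper. The symbols (label sequence y, observations x, feature functions f_k, weights λ_k, and the set of skip edges I) are notation assumed here for illustration, and the abstract does not state how the paper selects its skip edges or defines its additional factors.

```latex
% Linear-chain CRF: factors connect only adjacent labels.
p(y \mid x) = \frac{1}{Z(x)} \prod_{t=1}^{T} \Psi_t(y_t, y_{t-1}, x),
\qquad
\Psi_t(y_t, y_{t-1}, x) = \exp\Big( \sum_{k} \lambda_k \, f_k(y_t, y_{t-1}, x, t) \Big)

% Skip-chain CRF: additional factors connect distant positions (u, v)
% taken from a set of skip edges \mathcal{I} (assumed notation).
p(y \mid x) = \frac{1}{Z(x)}
  \prod_{t=1}^{T} \Psi_t(y_t, y_{t-1}, x)
  \prod_{(u,v) \in \mathcal{I}} \Psi_{uv}(y_u, y_v, x)
```

In both cases Z(x) normalizes over all label sequences. The skip factors are what allow evidence at one position to influence the label at a distant position, which is the kind of long-range transfer the abstract refers to.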
Kluegl, P., Toepfer, M., Lemmerich, F., Hotho, A., & Puppe, F. (2012). Collective information extraction with context-specific consistencies. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7523 LNAI, pp. 728–743). https://doi.org/10.1007/978-3-642-33460-3_52