Abstract
Bootstrapped pattern learning for entity extraction usually starts with seed entities and iteratively learns patterns and entities from unlabeled text. Patterns are scored by their ability to extract more positive entities and fewer negative entities. A problem is that, because labeled data is lacking, existing pattern scoring measures either assume unlabeled entities are negative or ignore them. In this paper, we improve pattern scoring by predicting the labels of unlabeled entities. We use various unsupervised features based on contrasting domain-specific and general text, and on exploiting distributional similarity and edit distances to learned entities. Our system outperforms existing pattern scoring algorithms for extracting drug-and-treatment entities from four medical forums.
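To make the three treatments of unlabeled entities concrete, here is a minimal sketch, not the paper's actual scoring measure. It assumes a simple precision-style score (expected positive extractions divided by counted extractions). The names `score_pattern`, `as_negative`, `ignore` and `similarity_to_positives` are hypothetical, and the string-similarity ratio from Python's `difflib` is only a stand-in for the edit-distance feature mentioned in the abstract.

```python
from difflib import SequenceMatcher
from typing import Callable, Iterable, Optional, Set


def score_pattern(
    extracted: Iterable[str],
    positives: Set[str],
    negatives: Set[str],
    predict_unlabeled: Callable[[str], Optional[float]],
) -> float:
    """Precision-style score: expected positives / counted extractions.

    predict_unlabeled returns a value in [0, 1] for an unlabeled entity,
    or None to leave that entity out of the score entirely.
    """
    positive_mass = 0.0
    counted = 0
    for entity in extracted:
        if entity in positives:
            p = 1.0
        elif entity in negatives:
            p = 0.0
        else:
            p = predict_unlabeled(entity)
            if p is None:
                continue  # "ignore" strategy: skip unlabeled entities
        positive_mass += p
        counted += 1
    return positive_mass / counted if counted else 0.0


def as_negative(entity: str) -> Optional[float]:
    """Baseline: treat every unlabeled entity as negative."""
    return 0.0


def ignore(entity: str) -> Optional[float]:
    """Baseline: drop unlabeled entities from the score."""
    return None


def similarity_to_positives(positives: Set[str]) -> Callable[[str], Optional[float]]:
    """Hypothetical predictor: best string similarity to any known positive.

    This is an assumption for illustration; the paper combines several
    unsupervised features rather than this single ratio.
    """
    def predict(entity: str) -> Optional[float]:
        return max(
            (SequenceMatcher(None, entity, pos).ratio() for pos in positives),
            default=0.0,
        )
    return predict


# Toy data (invented for illustration): "asprin" is an unlabeled misspelling.
positives = {"aspirin", "ibuprofen"}
negatives = {"headache"}
extracted = ["aspirin", "ibuprofen", "headache", "asprin"]

score_negative = score_pattern(extracted, positives, negatives, as_negative)
score_ignored = score_pattern(extracted, positives, negatives, ignore)
score_predicted = score_pattern(
    extracted, positives, negatives, similarity_to_positives(positives)
)
```

In this toy setup, treating the misspelled "asprin" as negative penalizes a pattern that is in fact extracting a drug name, while a predicted label close to 1 gives the pattern credit for it.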
Gupta, S., & Manning, C. D. (2014). Improved pattern learning for bootstrapped entity extraction. In CoNLL 2014 - 18th Conference on Computational Natural Language Learning, Proceedings (pp. 98–108). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/w14-1611