Abstract
We present a distributional approach to the problem of inducing parameters for unseen words in probabilistic parsers. Our k-nearest-neighbour (KNN) algorithm uses distributional similarity over an unlabelled corpus to match each unseen word to the most similar seen words, and induces parameters for the unseen word without retraining the parser. We apply this to domain adaptation for three different parsers that employ fine-grained syntactic categories, which allows us to focus on modifying the lexicon while leaving the structure of the parser itself intact. We demonstrate improvements in dependency recovery of 2%-6% on novel vocabulary in biomedical text.
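The general idea of matching an unseen word to its distributionally nearest seen words can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the context-window features, the cosine similarity, the similarity-weighted averaging, the toy corpus, the category labels, and all function names (`context_vectors`, `cosine`, `induce_lexical_entry`) are hypothetical choices made for the example.

```python
from collections import Counter, defaultdict
import math


def context_vectors(sentences, window=1):
    """Count the left/right neighbours of every token in an unlabelled corpus."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        for i, word in enumerate(sentence):
            lo, hi = max(0, i - window), min(len(sentence), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    side = "L" if j < i else "R"
                    vectors[word][(side, sentence[j])] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def induce_lexical_entry(word, vectors, seen_lexicon, k=2):
    """Mix the category distributions of the k most similar seen words.

    Assumption: the induced distribution is a similarity-weighted average;
    the paper may combine neighbours differently.
    """
    target = vectors.get(word, Counter())
    scored = [(cosine(target, vectors[s]), s) for s in seen_lexicon if s in vectors]
    neighbours = sorted((p for p in scored if p[0] > 0), reverse=True)[:k]
    total = sum(sim for sim, _ in neighbours)
    induced = Counter()
    for sim, seen_word in neighbours:
        for category, prob in seen_lexicon[seen_word].items():
            induced[category] += sim * prob / total
    return dict(induced)


# Toy unlabelled corpus (hypothetical data).
sentences = [
    ["the", "protein", "binds", "the", "receptor"],
    ["the", "enzyme", "binds", "the", "receptor"],
    ["the", "kinase", "binds", "the", "receptor"],
    ["the", "protein", "quickly", "degrades"],
]
# Lexicon of the already trained parser; category labels are illustrative.
seen_lexicon = {
    "protein": {"N": 1.0},
    "binds": {"(S\\NP)/NP": 1.0},
    "the": {"NP/N": 1.0},
}

vectors = context_vectors(sentences)
induced = induce_lexical_entry("kinase", vectors, seen_lexicon, k=2)
print(induced)
```

In this sketch only the lexicon gains an entry for the unseen word; nothing about the parser itself is re-estimated, which mirrors the abstract's point that adaptation happens without retraining.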
Cite
Mitchell, J., & Steedman, M. (2015). Parser Adaptation to the Biomedical Domain without Re-Training. In EMNLP 2015 - 6th International Workshop on Health Text Mining and Information Analysis, LOUHI 2015 - Proceedings of the Workshop (pp. 79–89). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w15-2610