Abstract
The machine learning-based approaches that dominate natural language processing research require massive amounts of labeled training data. Active learning has the potential to substantially reduce the human effort needed to prepare this data by allowing annotators to focus on only the most informative training examples. This paper shows that active learning can be used for domain adaptation of dependency parsers, not just in single-domain settings. We also show that entropy-based query selection strategies can be combined with partial annotation to annotate informative examples in the new domain without annotating full sentences. Simulations are common in work on active learning, but we also measure the actual time needed for manual annotation, to better frame the results obtained in our simulations. We evaluate query strategies based on both full and partial annotation in several domains, and find that they reduce the amount of in-domain training data needed for domain adaptation by up to 75% compared to random selection. We further find that partial annotation delivers better in-domain performance than full annotation for the same amount of human effort.
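The core idea of entropy-based query selection can be sketched as follows. This is a minimal illustration, not the authors' implementation: the pool format and function names are hypothetical, and it assumes the parser exposes a probability distribution over candidate heads for each token. High-entropy tokens are the ones worth querying, which is also what makes partial (per-token) annotation possible.

```python
import math

def token_entropy(head_probs):
    """Entropy (in nats) of a parser's distribution over candidate
    heads for one token; higher entropy = more model uncertainty."""
    return -sum(p * math.log(p) for p in head_probs if p > 0)

def select_queries(pool, k):
    """Rank unlabeled sentences by their most uncertain token and
    return the top-k sentence ids as annotation queries.

    `pool` maps a sentence id to a list of per-token head-probability
    distributions (a hypothetical interface for illustration)."""
    scored = {sid: max(token_entropy(dist) for dist in dists)
              for sid, dists in pool.items()}
    return sorted(scored, key=scored.get, reverse=True)[:k]

# A near-uniform head distribution (uncertain) outranks a peaked one:
pool = {"s1": [[0.5, 0.5]], "s2": [[0.9, 0.1]]}
print(select_queries(pool, 1))  # → ['s1']
```

For partial annotation, the same per-token entropy scores can be used directly, so the annotator labels only the high-entropy attachments instead of the whole sentence.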
Citation
Flannery, D., & Mori, S. (2015). Combining active learning and partial annotation for domain adaptation of a Japanese dependency parser. In IWPT 2015 - 14th International Conference on Parsing Technologies, Proceedings (pp. 11–19). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w15-2202