Abstract
Over the past decades, due to the lack of sufficient labeled data, most studies on cross-domain parsing have focused on unsupervised domain adaptation, assuming there is no target-domain training data. However, unsupervised approaches have made limited progress so far, due to the intrinsic difficulty of both domain adaptation and parsing. This paper tackles the semi-supervised domain adaptation problem for Chinese dependency parsing, based on two newly annotated large-scale domain-specific datasets. We propose a simple domain embedding approach to merge the source- and target-domain training data, which is shown to be more effective than both direct corpus concatenation and multi-task learning. To utilize unlabeled target-domain data, we employ recent contextualized word representations and show that a simple fine-tuning procedure can further boost cross-domain parsing accuracy by large margins.
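The domain embedding idea described above can be sketched as follows: each input token's word embedding is concatenated with a learned embedding for the corpus (source vs. target) its sentence came from, so a single parser can be trained on the merged data while still distinguishing domains. This is a minimal illustrative sketch, assuming NumPy; the vocabulary, dimensions, and function names are hypothetical, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

WORD_DIM, DOMAIN_DIM = 100, 8          # illustrative sizes, not from the paper
vocab = {"<unk>": 0, "the": 1, "parser": 2}
domains = {"source": 0, "target": 1}

# Randomly initialized lookup tables; in a real parser these are trained.
word_emb = rng.normal(size=(len(vocab), WORD_DIM))
domain_emb = rng.normal(size=(len(domains), DOMAIN_DIM))

def embed(tokens, domain):
    """Concatenate each token's word vector with its domain vector."""
    d = domain_emb[domains[domain]]
    rows = [np.concatenate([word_emb[vocab.get(t, 0)], d]) for t in tokens]
    return np.stack(rows)

x = embed(["the", "parser"], "target")
print(x.shape)  # (2, 108): word dim + domain dim
```

Because the domain vector is appended rather than kept in a separate model, the two corpora can simply be pooled into one training set, in contrast to multi-task learning, which maintains domain-specific components.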
Li, Z., Peng, X., Zhang, M., Wang, R., & Si, L. (2020). Semi-supervised domain adaptation for dependency parsing. In ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (pp. 2386–2395). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p19-1229