Previous studies on Chinese semantic role labeling (SRL) have concentrated on a single semantically annotated corpus. However, the training data in any single corpus is often limited, while the other existing semantically annotated corpora for Chinese SRL are scattered across different annotation frameworks. Data sparsity therefore remains a bottleneck. This situation calls for larger training datasets, or for effective approaches that can take advantage of highly heterogeneous data. In this paper, we focus mainly on the latter, that is, improving Chinese SRL by using heterogeneous corpora together. We propose a novel progressive learning model which augments the Progressive Neural Network with Gated Recurrent Adapters. The model can accommodate heterogeneous inputs and effectively transfer knowledge between them. We also release a new corpus, Chinese SemBank, for Chinese SRL. Experiments on CPB 1.0 show that our model outperforms state-of-the-art methods.
Xia, Q., Sha, L., Chang, B., & Sui, Z. (2017). A progressive learning approach to Chinese SRL using heterogeneous data. In ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) (Vol. 1, pp. 2069–2077). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/P17-1189