An Empirical Study of Contextual Data Augmentation for Japanese Zero Anaphora Resolution

4 citations · 70 Mendeley readers

Abstract

One critical issue in zero anaphora resolution (ZAR) is the scarcity of labeled data. This study explores how effectively this problem can be alleviated by data augmentation. We adopt a state-of-the-art data augmentation method, contextual data augmentation (CDA), which generates labeled training instances using a pretrained language model. CDA has been reported to work well for several other natural language processing tasks, including text classification and machine translation (Kobayashi, 2018; Wu et al., 2019; Gao et al., 2019). This study addresses two underexplored issues in CDA: how to reduce the computational cost of data augmentation and how to ensure the quality of the generated data. We also propose two methods to adapt CDA to ZAR: [MASK]-based augmentation and linguistically-controlled masking. Experimental results on Japanese ZAR show that our methods improve accuracy while reducing computational cost. Closer analysis reveals that the proposed methods can improve the quality of the augmented training data compared to conventional CDA.
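To make the core idea concrete, the following is a minimal sketch of [MASK]-based augmentation. The paper fills masked positions with a pretrained language model; here a toy token-substitution table stands in for that model, and the function name and `maskable` parameter are illustrative assumptions, not the authors' API.

```python
import random

# Hypothetical stand-in for a pretrained masked language model:
# maps a masked token to candidate replacement tokens. The actual
# method uses a pretrained LM (e.g., a BERT-style model); this
# table exists only to keep the sketch self-contained.
TOY_LM = {
    "watched": ["saw", "enjoyed", "watched"],
    "movie": ["film", "show", "movie"],
}

def mask_based_augment(tokens, maskable, rng):
    """Mask each position in `maskable` and fill it by sampling a
    replacement from the toy LM, producing a new training instance.
    Tokens outside `maskable` (e.g., the predicate and its zero-
    anaphoric argument labels) are left untouched, so the original
    annotation can be carried over to the augmented sentence."""
    out = list(tokens)
    for i in maskable:
        candidates = TOY_LM.get(tokens[i], [tokens[i]])
        out[i] = rng.choice(candidates)
    return out

rng = random.Random(0)
sent = ["I", "watched", "a", "movie"]
# Linguistically-controlled masking would restrict `maskable` by
# linguistic criteria; here positions 1 and 3 are chosen by hand.
aug = mask_based_augment(sent, maskable=[1, 3], rng=rng)
```

Because only masked positions change, each augmented sentence inherits the original ZAR labels for free, which is what makes this style of augmentation cheap to supervise.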

Citation (APA)
Konno, R., Matsubayashi, Y., Kiyono, S., Ouchi, H., Takahashi, R., & Inui, K. (2020). An Empirical Study of Contextual Data Augmentation for Japanese Zero Anaphora Resolution. In COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference (pp. 4956–4968). Association for Computational Linguistics (ACL). https://doi.org/10.5715/jnlp.28.721
