Abstract
Thanks to recent pretrained multilingual representation models, it has become feasible to exploit labeled data from one language to train a cross-lingual model that can then be applied to multiple new languages. In practice, however, labeled data often remains scarce, leading to subpar results. In this paper, we propose a novel data augmentation strategy for better cross-lingual natural language inference (NLI), enriching the data to reflect more diversity in a semantically faithful way. To this end, we present two methods of training a generative model to synthesize new examples, and then leverage the resulting data in an adversarial training regimen for greater robustness. In a series of detailed experiments, we show that this combination leads to substantial gains in cross-lingual inference.
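The abstract does not spell out the adversarial training procedure. A common way to realize such a regimen for text classifiers is FGM-style perturbation: compute the gradient of the loss with respect to the token embeddings, add a small perturbation in that direction, and train on both the clean and perturbed inputs. The sketch below illustrates this under that assumption only; NLIClassifier, its dimensions, and the epsilon setting are hypothetical stand-ins, not the authors' actual model or hyperparameters.

    import torch
    import torch.nn as nn

    class NLIClassifier(nn.Module):
        """Toy stand-in for a multilingual encoder plus a 3-way NLI head."""
        def __init__(self, vocab_size=30000, dim=128, num_labels=3):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, dim)
            self.head = nn.Linear(dim, num_labels)

        def forward(self, input_ids=None, inputs_embeds=None):
            if inputs_embeds is None:
                inputs_embeds = self.embed(input_ids)
            pooled = inputs_embeds.mean(dim=1)  # mean-pool over tokens
            return self.head(pooled)

    def adversarial_step(model, input_ids, labels, optimizer, epsilon=1e-2):
        """One step on the clean batch plus an FGM-style adversarial copy."""
        loss_fn = nn.CrossEntropyLoss()
        optimizer.zero_grad()
        # Clean pass; retain the embedding gradient so we can perturb it.
        embeds = model.embed(input_ids)
        embeds.retain_grad()
        clean_loss = loss_fn(model(inputs_embeds=embeds), labels)
        clean_loss.backward()
        # Perturb in the gradient direction, scaled to an L2 ball of radius
        # epsilon (norm taken over the whole batch for simplicity).
        delta = epsilon * embeds.grad / (embeds.grad.norm() + 1e-12)
        # Adversarial pass on detached inputs: its gradient accumulates with
        # the clean pass (the embedding table itself only sees the clean one).
        adv_loss = loss_fn(model(inputs_embeds=embeds.detach() + delta), labels)
        adv_loss.backward()
        optimizer.step()

    model = NLIClassifier()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    input_ids = torch.randint(0, 30000, (4, 16))  # batch of 4, 16 tokens each
    labels = torch.randint(0, 3, (4,))  # entailment / neutral / contradiction
    adversarial_step(model, input_ids, labels, optimizer)

In this variant both losses contribute gradients in a single optimizer step, so the model is pushed to classify correctly even under worst-case embedding noise; the paper may combine the losses or schedule the perturbation differently.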
Citation
Dong, X., Zhu, Y., Fu, Z., Xu, D., & de Melo, G. (2021). Data augmentation with adversarial training for cross-lingual NLI. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021), Volume 1: Long Papers (pp. 5158–5167). Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.acl-long.401