Abstract
In zero-shot cross-lingual transfer, a multilingual model is fine-tuned on a task in one language and then applied directly to another language. Although this approach has achieved success on various classification tasks (Wu and Dredze, 2019), its performance on natural language generation tasks falls short in quality (Rönnqvist et al., 2019; Vu et al., 2022), and the model sometimes generates text in the wrong language (Xue et al., 2021). In our study, we show that fine-tuning learns language-invariant representations, which are beneficial for classification tasks but harmful for generation tasks. Motivated by this, we propose a simple method that regularizes the model against learning language-invariant representations, together with a method for selecting model checkpoints without a development set in the target language; both yield better generation quality. Experiments on three semantically diverse generation tasks show that our method reduces the accidental translation problem by 68% and improves the ROUGE-L score (Lin, 2004) by 1.5 on average.
Li, T., & Murray, K. (2023). Why Does Zero-Shot Cross-Lingual Generation Fail? An Explanation and a Solution. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 12461–12476). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.findings-acl.789