Why Does Zero-Shot Cross-Lingual Generation Fail? An Explanation and a Solution

Abstract

In zero-shot cross-lingual transfer, a multilingual model is fine-tuned on a task in one language and then applied to the same task in another language. Although this approach has succeeded on various classification tasks (Wu and Dredze, 2019), its performance on natural language generation tasks falls short in quality (Rönnqvist et al., 2019; Vu et al., 2022), and the model sometimes generates output in the wrong language (Xue et al., 2021). In our study, we show that the fine-tuning process learns language-invariant representations, which are beneficial for classification tasks but harmful for generation tasks. Motivated by this, we propose a simple regularization method that discourages the model from learning language-invariant representations, together with a method for selecting model checkpoints without a development set in the target language; both result in better generation quality. Experiments on three semantically diverse generation tasks show that our method reduces the accidental-translation problem by 68% and improves the ROUGE-L score (Lin, 2004) by 1.5 on average.
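The "accidental translation" failure mentioned above can be quantified by checking whether a model's generations are even written in the target language's script. The sketch below is an illustrative heuristic only, not the paper's evaluation protocol: the function name, the 50% dominance threshold, and the use of Unicode character names as a script detector are all my assumptions.

```python
import unicodedata

def script_mismatch_rate(outputs, expected_script="DEVANAGARI"):
    """Rough proxy for accidental translation: the fraction of generated
    outputs whose letters are mostly NOT in the expected target script
    (e.g., a model asked to generate Hindi emitting Latin-script English)."""
    def dominant_is_expected(text):
        letters = [c for c in text if c.isalpha()]
        if not letters:
            return False  # treat empty/non-letter output as a mismatch
        # Unicode names encode the script, e.g. "DEVANAGARI LETTER NA"
        hits = sum(1 for c in letters
                   if expected_script in unicodedata.name(c, ""))
        return hits / len(letters) > 0.5
    mismatches = sum(1 for t in outputs if not dominant_is_expected(t))
    return mismatches / len(outputs)

# One Devanagari output and one Latin-script output -> 50% mismatch.
rate = script_mismatch_rate(["नमस्ते दुनिया", "hello world"])
```

A script check like this only catches wrong-script outputs; accidental translation into a same-script language (e.g., Spanish instead of French) would need a proper language-identification model.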

Cite

Li, T., & Murray, K. (2023). Why Does Zero-Shot Cross-Lingual Generation Fail? An Explanation and a Solution. In Findings of the Association for Computational Linguistics: ACL 2023 (pp. 12461–12476). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.findings-acl.789
