Abstract
In zero-shot cross-lingual transfer, a multilingual model is fine-tuned on a task in one language and then applied directly to another language. Although this approach has achieved success on various classification tasks (Wu and Dredze, 2019), its performance on natural language generation tasks falls short in quality (Rönnqvist et al., 2019; Vu et al., 2022), and the model sometimes generates text in the wrong language (Xue et al., 2021). In our study, we show that fine-tuning learns language-invariant representations, which are beneficial for classification tasks but harmful for generation tasks. Motivated by this, we propose a simple method that regularizes the model against learning language-invariant representations, together with a method for selecting model checkpoints without a development set in the target language; both yield better generation quality. Experiments on three semantically diverse generation tasks show that our method reduces the accidental translation problem by 68% and improves the ROUGE-L score (Lin, 2004) by 1.5 on average.
Li, T., & Murray, K. (2023). Why Does Zero-Shot Cross-Lingual Generation Fail? An Explanation and a Solution. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 12461–12476). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.findings-acl.789