Recent work demonstrates the potential of training a single model for multilingual machine translation. In parallel, denoising pretraining on unlabeled monolingual data, used as a starting point for finetuning bitext machine translation systems, has demonstrated strong performance gains. However, combining denoising pretraining with multilingual machine translation in a single model remains largely unexplored. In this work, we fill this gap by studying how multilingual translation models can be created through multilingual finetuning. Finetuning a multilingual model from a denoising pretrained model incorporates the benefits of large quantities of unlabeled monolingual data, which is particularly important for low-resource languages where bitext is rare. Further, we create the ML50 benchmark to facilitate reproducible research by standardizing training and evaluation data. On ML50, we show that multilingual finetuning significantly improves over both multilingual models trained from scratch and bilingual finetuning for translation into English. We also find that multilingual finetuning can significantly improve over multilingual models trained from scratch for zero-shot translation in non-English directions. Finally, we discuss why the pretraining-and-finetuning paradigm alone is not enough to address the challenges multilingual models face in to-Many directions.
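To make the multilingual finetuning recipe concrete, the sketch below shows how a denoising-pretrained multilingual checkpoint can be finetuned on bitext from several language directions at once. This is an illustration, not the paper's actual training setup (which used fairseq); it assumes the Hugging Face Transformers library, the facebook/mbart-large-50 checkpoint name on the Hugging Face Hub, and a toy in-memory bitext with made-up hyperparameters.

```python
# A minimal sketch, assuming Hugging Face Transformers and the
# facebook/mbart-large-50 denoising-pretrained checkpoint; hyperparameters
# and the tiny bitext below are illustrative, not the paper's settings.
import torch
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50")
tokenizer = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50")

# Hypothetical multilingual bitext: sentence pairs from several directions
# mixed together, each tagged with its source and target language codes.
bitext = [
    ("hi_IN", "en_XX", "यह एक उदाहरण वाक्य है।", "This is an example sentence."),
    ("de_DE", "en_XX", "Das ist ein Beispielsatz.", "This is an example sentence."),
]

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
model.train()

for src_lang, tgt_lang, src_text, tgt_text in bitext:
    # Language tags tell the single shared model which direction to translate,
    # so one set of parameters is finetuned on all directions jointly.
    tokenizer.src_lang = src_lang
    tokenizer.tgt_lang = tgt_lang
    batch = tokenizer(src_text, text_target=tgt_text, return_tensors="pt")
    loss = model(**batch).loss  # labels are shifted internally for the decoder
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In practice, the pairs would be sampled from all ML50-style training directions (with temperature-based sampling over languages) rather than iterated in a fixed order, but the core idea is the same: a single pretrained model, one loss, many directions.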
Tang, Y., Tran, C., Li, X., Chen, P. J., Goyal, N., Chaudhary, V., … Fan, A. (2021). Multilingual Translation from Denoising Pre-Training. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 3450–3466). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-acl.304