In neural machine translation, what does transfer learning transfer?

Abstract

Transfer learning improves quality for low-resource machine translation, but it is unclear what exactly it transfers. We perform several ablation studies that limit information transfer, then measure the quality impact across three language pairs to gain a black-box understanding of transfer learning. Word embeddings play an important role in transfer learning, particularly if they are properly aligned. Although transfer learning can be performed without embeddings, results are sub-optimal. In contrast, transferring only the embeddings but nothing else yields catastrophic results. We then investigate diagonal alignments with auto-encoders over real languages and randomly generated sequences, finding that even randomly generated sequences as parents yield noticeable but smaller gains. Finally, transfer learning can eliminate the need for a warm-up phase when training transformer models on high-resource language pairs.
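
To make the ablation setup concrete, below is a minimal PyTorch sketch of the core idea: a child model is initialized from a trained parent, with the embedding matrices optionally held out or transferred alone. The checkpoint layout and the "embed" parameter-naming convention are illustrative assumptions for this sketch, not the paper's actual toolkit or code.

```python
# Minimal sketch of the transfer-learning ablation: initialize a child
# NMT model from a trained parent, optionally excluding the embeddings
# or transferring only them. Assumes the parent checkpoint is a plain
# state dict (torch.save(model.state_dict(), path)) and that embedding
# parameters have "embed" in their names -- both assumptions.
import torch
import torch.nn as nn


def transfer_parameters(parent_ckpt: str, child_model: nn.Module,
                        mode: str = "all") -> None:
    """Copy parent parameters into child_model.

    mode = "all"             transfer every parameter
    mode = "no_embeddings"   transfer the body, keep random embeddings
    mode = "embeddings_only" transfer embeddings, keep a random body
    """
    parent_state = torch.load(parent_ckpt, map_location="cpu")
    child_state = child_model.state_dict()

    transferred = {}
    for name, tensor in parent_state.items():
        is_embedding = "embed" in name  # naming convention assumed
        if mode == "no_embeddings" and is_embedding:
            continue  # keep the child's random embedding init
        if mode == "embeddings_only" and not is_embedding:
            continue  # keep the child's random body init
        # Only shape-compatible parameters are copied; a parent/child
        # vocabulary-size mismatch is simply skipped in this sketch.
        if name in child_state and child_state[name].shape == tensor.shape:
            transferred[name] = tensor

    # strict=False keeps the child's own initialization for every
    # parameter we deliberately (or necessarily) did not transfer.
    child_model.load_state_dict(transferred, strict=False)
```

In the abstract's terms, "no_embeddings" corresponds to the workable but sub-optimal setting, while "embeddings_only" is the catastrophic one.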

Cite (APA)

Aji, A. F., Bogoychev, N., Heafield, K., & Sennrich, R. (2020). In neural machine translation, what does transfer learning transfer? In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 7701–7710). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.acl-main.688
