Zero-resource neural machine translation with monolingual pivot data

23 citations · 80 Mendeley readers

Abstract

Zero-shot neural machine translation (NMT) is a framework that uses source-pivot and target-pivot parallel data to train a source-target NMT system. An extension of zero-shot NMT is zero-resource NMT, which uses the zero-shot system to generate pseudo-parallel corpora and then trains it further on that data. In this paper, we extend zero-resource NMT by incorporating monolingual pivot-language data into training; since the pivot language is usually the highest-resource of the three, we expect monolingual pivot-language data to be the most abundant. We propose methods for generating pseudo-parallel corpora from pivot-language monolingual data and for leveraging these corpora to improve the zero-shot NMT system. We evaluate the methods on a high-resource language pair (German-Russian) using English as the pivot. Our proposed methods yield consistent improvements over strong zero-shot and zero-resource baselines, and even catch up to pivot-based models in BLEU without requiring their two-pass inference.
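The core idea of generating pseudo-parallel data from pivot-language monolingual text can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `translate_pivot_to_source` and `translate_pivot_to_target` functions are hypothetical stand-ins for trained English→German and English→Russian NMT models.

```python
# Sketch: build a synthetic source-target (German-Russian) corpus by
# translating pivot-language (English) monolingual sentences into both
# the source and the target language.

def translate_pivot_to_source(sentence: str) -> str:
    # Placeholder: a real system would decode with an English->German NMT model.
    return f"<de> {sentence}"

def translate_pivot_to_target(sentence: str) -> str:
    # Placeholder: a real system would decode with an English->Russian NMT model.
    return f"<ru> {sentence}"

def make_pseudo_parallel(pivot_monolingual: list[str]) -> list[tuple[str, str]]:
    """Translate each pivot sentence into source and target languages,
    yielding synthetic source-target sentence pairs for further training."""
    return [
        (translate_pivot_to_source(s), translate_pivot_to_target(s))
        for s in pivot_monolingual
    ]

corpus = make_pseudo_parallel(["the cat sat on the mat"])
```

Because both sides of each pair are model output, such a corpus is noisier than true bitext; the paper's contribution lies in how these pairs are used to further train the zero-shot system.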

Citation (APA)

Currey, A., & Heafield, K. (2019). Zero-resource neural machine translation with monolingual pivot data. In EMNLP-IJCNLP 2019 - Proceedings of the 3rd Workshop on Neural Generation and Translation (pp. 99–107). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/d19-5610
