Zero-shot neural machine translation (NMT) is a framework that uses source-pivot and target-pivot parallel data to train a source-target NMT system. An extension to zero-shot NMT is zero-resource NMT, which generates pseudo-parallel corpora using a zero-shot system and further trains the zero-shot system on that data. In this paper, we expand on zero-resource NMT by incorporating monolingual data in the pivot language into training; since the pivot language is usually the highest-resource language of the three, we expect monolingual pivot-language data to be most abundant. We propose methods for generating pseudo-parallel corpora using pivot-language monolingual data and for leveraging the pseudo-parallel corpora to improve the zero-shot NMT system. We evaluate these methods for a high-resource language pair (German-Russian) using English as the pivot. We show that our proposed methods yield consistent improvements over strong zero-shot and zero-resource baselines and even catch up to pivot-based models in BLEU (while not requiring the two-pass inference that pivot models require).
CITATION STYLE
Currey, A., & Heafield, K. (2019). Zero-resource neural machine translation with monolingual pivot data. In EMNLP-IJCNLP 2019 - Proceedings of the 3rd Workshop on Neural Generation and Translation (pp. 99–107). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/d19-5610
Mendeley helps you to discover research relevant for your work.