Bi-Directional Neural Machine Translation with Synthetic Parallel Data

Citations: 37
Readers (Mendeley): 121

Abstract

Despite impressive progress in high-resource settings, Neural Machine Translation (NMT) still struggles in low-resource and out-of-domain scenarios, often failing to match the quality of phrase-based translation. We propose a novel technique that combines back-translation and multilingual NMT to improve performance in these difficult cases. Our technique trains a single model for both directions of a language pair, allowing us to back-translate source or target monolingual data without requiring an auxiliary model. We then continue training on the augmented parallel data, enabling a cycle of improvement for a single model that can incorporate any source, target, or parallel data to improve both translation directions. As a byproduct, these models can reduce training and deployment costs significantly compared to uni-directional models. Extensive experiments show that our technique outperforms standard back-translation in low-resource scenarios, improves quality on cross-domain tasks, and effectively reduces costs across the board.
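The core idea — one model serving both translation directions, back-translating monolingual data to synthesize parallel pairs, then continuing training on the union — can be sketched as follows. This is an illustrative assumption-laden toy, not the paper's implementation: the function names, the `<2xx>` target-language token (a convention borrowed from multilingual NMT), and the stand-in `toy_model` are all hypothetical, and a real system would use a trained sequence-to-sequence model.

```python
# Minimal sketch of one round of the bi-directional back-translation cycle.
# All names are illustrative; `toy_model` stands in for a trained NMT model.

OTHER = {"en": "fr", "fr": "en"}  # assumed two-language pair

def tag(sentence, target_lang):
    """Prepend a target-language token so one model handles both directions."""
    return f"<2{target_lang}> {sentence}"

def toy_model(tagged_sentence):
    """Stand-in for a bi-directional NMT model: echoes the untagged text."""
    _, text = tagged_sentence.split(" ", 1)
    return text

def back_translate(model, monolingual, src_lang):
    """Turn monolingual text in src_lang into synthetic parallel pairs.

    Each sentence is machine-translated into the other language; the
    synthetic translation becomes the source side and the real sentence
    the target side, so training on these pairs improves translation
    *into* src_lang.
    """
    pairs = []
    for sent in monolingual:
        synthetic = model(tag(sent, OTHER[src_lang]))
        pairs.append((tag(synthetic, src_lang), sent))
    return pairs

# One round: synthesize pairs from monolingual data on both sides, then
# continue training the single model on real + synthetic data (not shown).
en_mono = ["the cat sat"]
fr_mono = ["le chat"]
augmented = back_translate(toy_model, en_mono, "en") + \
            back_translate(toy_model, fr_mono, "fr")
```

Because the same model produces and consumes the synthetic data, each training round can improve both directions at once, which is what removes the need for a separate auxiliary back-translation model.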

Citation (APA)

Niu, X., Denkowski, M., & Carpuat, M. (2018). Bi-Directional Neural Machine Translation with Synthetic Parallel Data. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 84–91). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w18-2710
