Filtering Back-Translated Data in Unsupervised Neural Machine Translation

11 citations · 56 Mendeley readers

Abstract

Unsupervised neural machine translation (NMT) uses only monolingual data for training. The quality of back-translated data plays an important role in the performance of NMT systems, yet not all generated pseudo-parallel sentence pairs are of the same quality. Taking inspiration from domain adaptation, where in-domain sentences are given more weight during training, in this paper we propose an approach to filter back-translated data as part of the training process of unsupervised NMT. Our approach gives more weight to good pseudo-parallel sentence pairs in the back-translation phase. We calculate the weight of each pseudo-parallel sentence pair using its sentence-wise round-trip BLEU score, normalized batch-wise. We compare our approach with the current state-of-the-art approaches for unsupervised NMT.
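The weighting step the abstract describes can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: translate_s2t and translate_t2s are hypothetical stand-ins for the two translation directions of the unsupervised NMT model, and the batch-wise normalization shown (dividing each score by the batch sum) is one plausible reading of the abstract, which does not spell out the exact formula. Sentence-level BLEU is computed with the sacrebleu library.

```python
from typing import Callable, List
import sacrebleu

def round_trip_weights(
    batch: List[str],
    translate_s2t: Callable[[str], str],  # source -> target model (hypothetical)
    translate_t2s: Callable[[str], str],  # target -> source model (hypothetical)
) -> List[float]:
    """Weight each pseudo-parallel pair in a batch by its sentence-level
    round-trip BLEU, normalized over the batch (assumed: sum-to-one)."""
    scores = []
    for src in batch:
        pseudo_tgt = translate_s2t(src)            # back-translation step
        reconstruction = translate_t2s(pseudo_tgt)  # round trip back to source side
        # Sentence-level BLEU of the reconstruction against the original source.
        scores.append(sacrebleu.sentence_bleu(reconstruction, [src]).score)
    total = sum(scores) or 1.0                     # guard against an all-zero batch
    return [s / total for s in scores]
```

In training, such weights would multiply the per-pair loss terms of the back-translation phase, so that pairs whose round trip reconstructs the source well contribute more to the gradient than noisy ones.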

Citation (APA)

Khatri, J., & Bhattacharyya, P. (2020). Filtering Back-Translated Data in Unsupervised Neural Machine Translation. In COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference (pp. 4334–4339). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.coling-main.383
