A2R2: Robust Unsupervised Neural Machine Translation with Adversarial Attack and Regularization on Representations

Abstract

Unsupervised neural machine translation (UNMT) has recently achieved significant progress without requiring any parallel data. UNMT models typically adopt a sequence-to-sequence architecture, with an encoder that maps sentences in different languages into a shared latent space and a decoder that generates the corresponding translations. Denoising autoencoding and back-translation are invoked in every training iteration so that the model learns the relationships between sentence pairs within and across languages. However, sentences produced by the noise model of autoencoding or by the reverse model of back-translation typically differ from those written by humans, which may introduce inference bias. In this paper, we propose a regularization method for back-translation that explicitly draws the representations of sentence pairs closer in the shared space. To enhance robustness to sentences produced by autoencoding or back-translation, an adversarial attack on the representations is applied. Experiments on unsupervised English-French, English-German, and English-Romanian benchmarks show that our approach outperforms the cross-lingual language model (XLM) baseline by 0.4 to 1.8 BLEU points. Additionally, the improvement on noisy test sets exceeds 5 BLEU points in most translation directions.
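The abstract names two mechanisms: a regularizer that draws the representations of a sentence and its back-translation closer in the shared latent space, and an adversarial attack on those representations. The PyTorch sketch below illustrates one plausible instantiation of both ideas; the mean-pooling, the squared-L2 agreement loss, the FGSM-style perturbation, and the epsilon value are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of (1) a representation-agreement regularizer for
# back-translated sentence pairs and (2) an FGSM-style adversarial
# perturbation of representations. All names and hyperparameters here are
# assumptions for illustration, not the authors' implementation.
import torch
import torch.nn.functional as F


def pool(hidden, mask):
    """Mean-pool token representations into one sentence vector.

    hidden: (batch, seq_len, dim); mask: (batch, seq_len), 1 for real tokens.
    """
    mask = mask.unsqueeze(-1).float()
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)


def representation_agreement_loss(enc_src, mask_src, enc_bt, mask_bt):
    """Draw a sentence and its back-translation closer in the shared space,
    here via squared L2 distance between pooled sentence vectors."""
    v_src = pool(enc_src, mask_src)
    v_bt = pool(enc_bt, mask_bt)
    return F.mse_loss(v_src, v_bt)


def adversarial_representations(embeddings, loss, epsilon=1.0):
    """FGSM-style attack: shift the representations along the (normalized)
    gradient of the loss, so the model is trained on a worst-case nearby point.

    `embeddings` must have requires_grad=True and `loss` must depend on it.
    """
    (grad,) = torch.autograd.grad(loss, embeddings, retain_graph=True)
    delta = epsilon * grad / (grad.norm(dim=-1, keepdim=True) + 1e-9)
    return embeddings + delta.detach()
```

In training, the perturbed representations would be passed back through the model so that the translation loss is also minimized at the perturbed point, which is the standard way such an attack is turned into adversarial training.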

Citation (APA)
Yu, H., Luo, H., Yi, Y., & Cheng, F. (2021). A2R2: Robust Unsupervised Neural Machine Translation with Adversarial Attack and Regularization on Representations. IEEE Access, 9, 19990–19998. https://doi.org/10.1109/ACCESS.2021.3054935
