Abstract
While pre-training techniques work well in natural language processing, how to pre-train a decoder and effectively leverage it for neural machine translation (NMT) remains an open problem. The main reason is that the cross-attention module between the encoder and decoder cannot be pre-trained: when the separately pre-trained components are combined, the decoder's cross-attention receives inputs from encoder outputs it has never seen, so the joint model underperforms during fine-tuning. In this paper, we propose a better pre-training method for NMT that defines a semantic interface (SemFace) between the pre-trained encoder and the pre-trained decoder. Specifically, we propose two types of semantic interfaces: CL-SemFace, which uses cross-lingual embeddings as the interface, and VQ-SemFace, which employs vector-quantized embeddings to constrain the encoder outputs and decoder inputs to the same language-independent space. We conduct extensive experiments on six supervised translation pairs and three unsupervised pairs. The results demonstrate that SemFace effectively connects the pre-trained encoder and decoder, improving over previous pre-training-based NMT models by 3.7 and 1.5 BLEU points on the supervised and unsupervised tasks, respectively.
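For intuition, the following is a minimal NumPy sketch (not the authors' implementation) of the vector-quantization step that VQ-SemFace builds on: each encoder state is snapped to its nearest neighbor in a fixed codebook, so the decoder can be pre-trained against this stable, language-independent input space before the encoder and decoder are ever joined. All names, sizes, and the codebook initialization are illustrative assumptions.

```python
# Hypothetical sketch of the vector-quantization idea behind VQ-SemFace.
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(512, 256))   # 512 shared code vectors of dim 256 (illustrative)

def quantize(encoder_states: np.ndarray) -> np.ndarray:
    """Map each 256-dim state to its nearest codebook vector under L2 distance."""
    # Squared distances between every state and every code: shape (T, 512).
    d2 = ((encoder_states[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    nearest = d2.argmin(axis=1)          # index of closest code per token
    return codebook[nearest]             # replace states by their codes

# Toy "encoder outputs" for a 7-token sentence. After quantization, the
# decoder's cross-attention sees only vectors from the fixed codebook,
# i.e., the shared interface space described in the abstract.
states = rng.normal(size=(7, 256))
interface_states = quantize(states)
print(interface_states.shape)            # (7, 256)
```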
Citation
Ren, S., Zhou, L., Liu, S., Wei, F., Zhou, M., & Ma, S. (2021). SemFace: Pre-training encoder and decoder with a semantic interface for neural machine translation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Vol. 1, pp. 4518–4527). Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.acl-long.348