Improving the Transformer translation model with document-level context

179 citations · 261 Mendeley readers

Abstract

Although the Transformer translation model (Vaswani et al., 2017) has achieved state-of-the-art performance in a variety of translation tasks, how to use document-level context to deal with discourse phenomena problematic for Transformer still remains a challenge. In this work, we extend the Transformer model with a new context encoder to represent document-level context, which is then incorporated into the original encoder and decoder. As large-scale document-level parallel corpora are usually not available, we introduce a two-step training method to take full advantage of abundant sentence-level parallel corpora and limited document-level parallel corpora. Experiments on the NIST Chinese-English datasets and the IWSLT French-English datasets show that our approach improves over Transformer significantly.
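
The sketch below is not the authors' code; it is a minimal PyTorch-style illustration of the idea the abstract describes: a separate context encoder represents document-level context (e.g. preceding source sentences), and the sentence encoder attends to its output through an additional multi-head attention sub-layer. All module names (ContextAwareEncoderLayer, DocContextEncoder), hyperparameters, and wiring choices here are assumptions made for illustration only; the decoder-side integration is omitted.

```python
# Illustrative sketch only: a context encoder plus a sentence encoder whose layers
# attend to the document-level context via an extra attention sub-layer.
import torch
import torch.nn as nn


class ContextAwareEncoderLayer(nn.Module):
    """Transformer encoder layer with an added attention over document context."""

    def __init__(self, d_model=512, nhead=8, dim_ff=2048, dropout=0.1):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, nhead, dropout=dropout, batch_first=True)
        self.ctx_attn = nn.MultiheadAttention(d_model, nhead, dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, dim_ff), nn.ReLU(), nn.Linear(dim_ff, d_model))
        self.norm1, self.norm2, self.norm3 = nn.LayerNorm(d_model), nn.LayerNorm(d_model), nn.LayerNorm(d_model)
        self.drop = nn.Dropout(dropout)

    def forward(self, x, ctx):
        h, _ = self.self_attn(x, x, x)                 # self-attention over the current sentence
        x = self.norm1(x + self.drop(h))
        h, _ = self.ctx_attn(x, ctx, ctx)              # attention over document-level context
        x = self.norm2(x + self.drop(h))
        return self.norm3(x + self.drop(self.ff(x)))   # position-wise feed-forward


class DocContextEncoder(nn.Module):
    """Source-side encoder conditioned on a small context encoder's output."""

    def __init__(self, vocab_size=32000, d_model=512, nhead=8, n_ctx_layers=1, n_enc_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        ctx_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.context_encoder = nn.TransformerEncoder(ctx_layer, n_ctx_layers)
        self.layers = nn.ModuleList(
            [ContextAwareEncoderLayer(d_model, nhead) for _ in range(n_enc_layers)]
        )

    def forward(self, src_ids, ctx_ids):
        ctx = self.context_encoder(self.embed(ctx_ids))  # encode surrounding sentences
        x = self.embed(src_ids)
        for layer in self.layers:
            x = layer(x, ctx)
        return x  # would be consumed by a (similarly context-aware) Transformer decoder


if __name__ == "__main__":
    model = DocContextEncoder()
    src = torch.randint(0, 32000, (2, 20))   # current sentences
    ctx = torch.randint(0, 32000, (2, 60))   # preceding document context
    print(model(src, ctx).shape)             # torch.Size([2, 20, 512])
```

Under the same assumptions, the two-step training the abstract mentions could be approximated by first training the sentence-level parameters on the large sentence-level corpus, then updating only the newly added document-level modules (here, context_encoder and the ctx_attn sub-layers) on the smaller document-level corpus while keeping the rest frozen.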

Cite (APA)

Zhang, J., Luan, H., Sun, M., Zhai, F. F., Xu, J., Zhang, M., & Liu, Y. (2018). Improving the transformer translation model with document-level context. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018 (pp. 533–542). Association for Computational Linguistics. https://doi.org/10.18653/v1/d18-1049
