This paper does not aim to introduce a novel model for document-level neural machine translation. Instead, we return to the original Transformer model and ask the following question: Is the capacity of current models strong enough for document-level translation? Interestingly, we observe that the original Transformer with appropriate training techniques can achieve strong results for document translation, even for documents as long as 2,000 words. We evaluate this model and several recent approaches on nine document-level datasets and two sentence-level datasets across six languages. Experiments show that document-level Transformer models outperform sentence-level ones and many previous methods on a comprehensive set of metrics, including BLEU, four lexical indices, three newly proposed assistant linguistic indicators, and human evaluation. Our new datasets and evaluation scripts are available at https://github.com/sunzewei2715/Doc2Doc_NMT.
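As an aside (not part of the paper), a minimal sketch of how corpus-level BLEU can be computed with the sacrebleu library, assuming one hypothesis and one reference line per translated document; the example strings are placeholders, not data from the paper:

```python
# Illustrative only: score document-level outputs with sacrebleu's corpus BLEU.
import sacrebleu

# Hypothetical system outputs and references, one string per document.
hypotheses = [
    "the model translates the whole document in one pass .",
]
references = [
    "the model translates the entire document in a single pass .",
]

# corpus_bleu takes a list of hypotheses and a list of reference streams.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU = {bleu.score:.2f}")
```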
Citation: Sun, Z., Wang, M., Zhou, H., Zhao, C., Huang, S., Chen, J., & Li, L. (2022). Rethinking Document-level Neural Machine Translation. In Findings of the Association for Computational Linguistics: ACL 2022 (pp. 3537–3548). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-acl.279