Towards Inducing Long-Context Abilities in Multilingual Neural Machine Translation Models


Abstract

Neural Machine Translation (NMT) models have traditionally used Sinusoidal Positional Embeddings (PEs), which often struggle to capture long-range dependencies and are inefficient for handling extended context or document-level translation tasks. This work addresses the challenge of transitioning pre-trained NMT models from absolute Sinusoidal PEs to Relative PEs, such as RoPE and ALiBi, without compromising performance. We demonstrate that parameter-efficient finetuning, using only a small amount of high-quality data, can successfully facilitate this transition. Experimental results indicate that switching from Sinusoidal to Relative PEs results in competitive translation quality on sentence-level evaluation benchmarks. Additionally, models trained with RoPE consistently outperform those using ALiBi and Sinusoidal PEs on document-level benchmarks across both string-based metrics and qualitative evaluations. Moreover, we find that a small amount of long-context data in a few languages is sufficient for cross-lingual length generalization, thereby inducing long-context capabilities.
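
The abstract contrasts absolute Sinusoidal PEs with the two relative schemes it evaluates. For readers unfamiliar with those schemes, below is a minimal NumPy sketch of RoPE's position-dependent rotation and ALiBi's distance penalty. This is illustrative only, not the implementation evaluated in the paper; the function names, shapes, and the simplified ALiBi slope schedule are assumptions.

```python
import numpy as np

def rope(x: np.ndarray, positions: np.ndarray, base: float = 10000.0) -> np.ndarray:
    """Apply Rotary Position Embeddings (RoPE) to a [seq_len, dim] array.

    Each consecutive pair of feature dimensions is rotated by an angle
    proportional to the token position, so query-key dot products depend
    only on the relative offset between tokens.
    """
    seq_len, dim = x.shape
    # Rotation frequencies fall off geometrically across dimension pairs,
    # mirroring the sinusoidal frequency schedule.
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))   # [dim/2]
    angles = positions[:, None] * inv_freq[None, :]           # [seq_len, dim/2]
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

def alibi_bias(seq_len: int, num_heads: int) -> np.ndarray:
    """ALiBi: a static, head-specific linear penalty on attention scores,
    proportional to query-key distance and added before the softmax.
    Simplified slope schedule; the original uses 2**(-8*i/num_heads).
    """
    slopes = 2.0 ** -np.arange(1, num_heads + 1)                  # [heads]
    dist = np.arange(seq_len)[None, :] - np.arange(seq_len)[:, None]
    return slopes[:, None, None] * -np.abs(dist)[None, :, :]     # [heads, seq, seq]

# Relative offsets are all that matter under RoPE: shifting every
# position by the same amount leaves the attention scores unchanged.
q, k = np.random.randn(8, 64), np.random.randn(8, 64)
pos = np.arange(8, dtype=np.float64)
scores_a = rope(q, pos) @ rope(k, pos).T
scores_b = rope(q, pos + 100) @ rope(k, pos + 100).T
assert np.allclose(scores_a, scores_b)
```

Because both schemes condition attention on relative rather than absolute positions, they tend to extrapolate to sequence lengths unseen during training more gracefully than absolute Sinusoidal PEs, which is the property the paper's finetuning transition targets.
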

Citation (APA)

Gumma, V., Chitale, P. A., & Bali, K. (2025). Towards Inducing Long-Context Abilities in Multilingual Neural Machine Translation Models. In Proceedings of the 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies: Long Papers, NAACL-HLT 2025 (Vol. 1, pp. 7158–7170). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2025.naacl-long.366
