Deep neural language models for machine translation


Abstract

Neural language models (NLMs) have been able to improve machine translation (MT) thanks to their ability to generalize well to long contexts. Despite recent successes of deep neural networks in speech and vision, the general practice in MT is to incorporate NLMs with only one or two hidden layers, and there have been no clear results on whether having more layers helps. In this paper, we demonstrate that deep NLMs with three or four layers outperform those with fewer layers in terms of both perplexity and translation quality. We combine various techniques to successfully train deep NLMs that jointly condition on both the source and target contexts. When reranking n-best lists of a strong web-forum baseline, our deep models yield an average boost of 0.5 TER / 0.5 BLEU points compared to using a shallow NLM. Additionally, we adapt our models to a new SMS-chat domain and obtain a similar gain of 1.0 TER / 0.5 BLEU points.
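As a rough illustration of the architecture the abstract describes (a feed-forward NLM with a stack of hidden layers that jointly conditions on source- and target-side context windows), here is a minimal PyTorch sketch. All names, window sizes, dimensions, and layer counts below are illustrative assumptions, not the paper's reported configuration or the authors' code.

```python
import torch
import torch.nn as nn

class DeepJointNLM(nn.Module):
    """Hypothetical sketch of a deep NLM that jointly conditions on a
    window of source words and the preceding target words; the abstract
    reports gains when the hidden stack is three or four layers deep."""

    def __init__(self, vocab_size, emb_dim=128, hidden_dim=512,
                 num_layers=4, src_window=5, tgt_window=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        in_dim = emb_dim * (src_window + tgt_window)
        layers = []
        for _ in range(num_layers):
            layers += [nn.Linear(in_dim, hidden_dim), nn.ReLU()]
            in_dim = hidden_dim
        self.hidden = nn.Sequential(*layers)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src_ctx, tgt_ctx):
        # src_ctx: (batch, src_window) and tgt_ctx: (batch, tgt_window),
        # both tensors of word indices; concatenate the two contexts.
        ctx = torch.cat([src_ctx, tgt_ctx], dim=1)
        x = self.embed(ctx).flatten(start_dim=1)  # concatenated embeddings
        return self.out(self.hidden(x))           # logits for next target word

# Illustrative use: score hypotheses when reranking an n-best list.
# model = DeepJointNLM(vocab_size=40000)
# logits = model(src_ctx, tgt_ctx)
```

The per-word logits would be turned into log-probabilities and summed over a hypothesis to produce the reranking feature; how the paper combines this score with the baseline features is not specified in the abstract.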

Citation (APA)

Luong, M. T., Kayser, M., & Manning, C. D. (2015). Deep neural language models for machine translation. In Proceedings of the 19th Conference on Computational Natural Language Learning (CoNLL 2015) (pp. 305–309). Association for Computational Linguistics. https://doi.org/10.18653/v1/k15-1031
