Deep neural language models for machine translation

Citations: 23 · Mendeley readers: 125

Abstract

Neural language models (NLMs) have been able to improve machine translation (MT) thanks to their ability to generalize well to long contexts. Despite recent successes of deep neural networks in speech and vision, the general practice in MT is to incorporate NLMs with only one or two hidden layers, and there have been no clear results on whether having more layers helps. In this paper, we demonstrate that deep NLMs with three or four layers outperform those with fewer layers in terms of both perplexity and translation quality. We combine various techniques to successfully train deep NLMs that jointly condition on both the source and target contexts. When reranking n-best lists of a strong web-forum baseline, our deep models yield an average boost of 0.5 TER / 0.5 BLEU points compared to using a shallow NLM. Additionally, we adapt our models to a new SMS-chat domain and obtain a similar gain of 1.0 TER / 0.5 BLEU points.
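The abstract describes a feed-forward NLM that predicts each target word from both a window of preceding target words and a window of source-side words, stacked three to four hidden layers deep, with the model's log-probability then used as an extra feature when reranking n-best translations. The sketch below is an illustrative reconstruction of that setup, not the authors' code; the layer sizes, window widths, use of PyTorch, and the `rerank` helper are all assumptions.

```python
import torch
import torch.nn as nn

class DeepJointNLM(nn.Module):
    """Feed-forward NLM conditioning on source and target context windows.

    An illustrative sketch, not the authors' implementation; all sizes and
    hyperparameters are assumptions.
    """
    def __init__(self, src_vocab, tgt_vocab, emb_dim=128,
                 src_window=5, tgt_window=3, hidden_dim=512, num_layers=4):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        in_dim = emb_dim * (src_window + tgt_window)
        layers = []
        for _ in range(num_layers):  # three or four layers = the "deep" regime
            layers += [nn.Linear(in_dim, hidden_dim), nn.ReLU()]
            in_dim = hidden_dim
        self.hidden = nn.Sequential(*layers)
        self.out = nn.Linear(hidden_dim, tgt_vocab)  # logits over target vocab

    def forward(self, src_ctx, tgt_ctx):
        # src_ctx: (batch, src_window) ids of source words around the aligned
        # position; tgt_ctx: (batch, tgt_window) ids of preceding target words.
        e = torch.cat([self.src_emb(src_ctx).flatten(1),
                       self.tgt_emb(tgt_ctx).flatten(1)], dim=-1)
        return self.out(self.hidden(e))  # next-target-word logits

def rerank(hypotheses, nlm_logprob, lam=0.5):
    """Re-sort an n-best list after adding a weighted NLM score.

    hypotheses: list of (base_score, hypothesis) pairs from the MT system;
    nlm_logprob: a function scoring a hypothesis with the NLM (hypothetical);
    lam: interpolation weight, tuned on held-out data in practice.
    """
    return sorted(hypotheses,
                  key=lambda h: h[0] + lam * nlm_logprob(h[1]),
                  reverse=True)
```

Stacking linear-plus-nonlinearity blocks is one common way to reach the three-to-four-layer depth the abstract reports as helpful; the specific training techniques the authors combine are detailed in the paper itself.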

Citation (APA)

Luong, M. T., Kayser, M., & Manning, C. D. (2015). Deep neural language models for machine translation. In CoNLL 2015 - 19th Conference on Computational Natural Language Learning, Proceedings (pp. 305–309). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/k15-1031

Readers' Seniority

PhD / Post grad / Masters / Doc: 44 (66%)
Researcher: 11 (16%)
Lecturer / Post doc: 7 (10%)
Professor / Associate Prof.: 5 (7%)

Readers' Discipline

Computer Science: 55 (76%)
Linguistics: 11 (15%)
Engineering: 3 (4%)
Agricultural and Biological Sciences: 3 (4%)
