Neural language models (NLMs) have been able to improve machine translation (MT) thanks to their ability to generalize well to long contexts. Despite recent successes of deep neural networks in speech and vision, the general practice in MT is to incorporate NLMs with only one or two hidden layers, and there have been no clear results on whether having more layers helps. In this paper, we demonstrate that deep NLMs with three or four layers outperform those with fewer layers in terms of both perplexity and translation quality. We combine various techniques to successfully train deep NLMs that jointly condition on both the source and target contexts. When reranking n-best lists of a strong web-forum baseline, our deep models yield an average boost of 0.5 TER / 0.5 BLEU points compared to using a shallow NLM. Additionally, we adapt our models to a new SMS-chat domain and obtain a similar gain of 1.0 TER / 0.5 BLEU points.
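To make the described setup concrete, below is a minimal sketch (not the authors' implementation) of a feed-forward joint NLM that conditions on a window of aligned source words plus the previous target words, with a configurable number of hidden layers; all hyperparameters, class names, and window sizes are illustrative assumptions.

```python
# Hypothetical sketch of a deep joint neural language model:
# predicts the next target word from a source window and a target history.
import torch
import torch.nn as nn


class DeepJointNLM(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb_dim=128, hidden_dim=512,
                 src_window=5, tgt_window=4, num_layers=4):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        in_dim = (src_window + tgt_window) * emb_dim
        layers = []
        for i in range(num_layers):  # stack 3-4 layers for the "deep" variants
            layers += [nn.Linear(in_dim if i == 0 else hidden_dim, hidden_dim),
                       nn.ReLU()]
        self.hidden = nn.Sequential(*layers)
        self.out = nn.Linear(hidden_dim, tgt_vocab)

    def forward(self, src_ctx, tgt_ctx):
        # src_ctx: (batch, src_window) aligned source-side context words
        # tgt_ctx: (batch, tgt_window) previous target words
        x = torch.cat([self.src_emb(src_ctx).flatten(1),
                       self.tgt_emb(tgt_ctx).flatten(1)], dim=-1)
        return self.out(self.hidden(x))  # logits over the next target word


# Example usage: per-word log-probabilities from such a model could serve as
# an additional feature when reranking n-best lists from an MT baseline.
model = DeepJointNLM(src_vocab=10000, tgt_vocab=10000)
src = torch.randint(0, 10000, (2, 5))
tgt = torch.randint(0, 10000, (2, 4))
log_probs = torch.log_softmax(model(src, tgt), dim=-1)
```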
Luong, M. T., Kayser, M., & Manning, C. D. (2015). Deep neural language models for machine translation. In CoNLL 2015 - 19th Conference on Computational Natural Language Learning, Proceedings (pp. 305–309). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/k15-1031