Using target-side monolingual data for neural machine translation through multi-task learning

Tobias Domhan; Felix Hieber

Conference Proceedings, Open Access

EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings (2017) 1500-1505

DOI: 10.18653/v1/d17-1158

Abstract

The performance of Neural Machine Translation (NMT) models relies heavily on the availability of sufficient amounts of parallel data, and an efficient and effective way of leveraging the vastly available amounts of monolingual data has yet to be found. We propose to modify the decoder in a neural sequence-to-sequence model to enable multi-task learning for two strongly related tasks: target-side language modeling and translation. The decoder predicts the next target word through two channels, a target-side language model on the lowest layer, and an attentional recurrent model which is conditioned on the source representation. This architecture allows joint training on both large amounts of monolingual and moderate amounts of bilingual data to improve NMT performance. Initial results in the news domain for three language pairs show moderate but consistent improvements over a baseline trained on bilingual data only.
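
The joint training objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each channel produces its own logits over the target vocabulary, that the language-model channel is trained on the target side of both bilingual and monolingual sentences, and that the two losses are summed with a hypothetical weight `lm_weight`. None of these details are stated in the abstract, and all function and parameter names are illustrative.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def nll(logits, target_id):
    """Negative log-likelihood of the reference word under the given logits."""
    return -log_softmax(logits)[target_id]


def multitask_loss(parallel_steps, monolingual_steps, lm_weight=1.0):
    """Combine translation and language-modeling losses (illustrative only).

    parallel_steps: list of (translation_logits, lm_logits, target_id),
        one entry per target position of a bilingual sentence pair.
    monolingual_steps: list of (lm_logits, target_id), one entry per
        position of a target-side monolingual sentence. No source sentence
        exists here, so only the language-model channel is scored.
    lm_weight: hypothetical scalar weighting the language-model loss.
    """
    # Source-conditioned channel: trained on bilingual data only.
    translation_loss = sum(nll(t, y) for t, _, y in parallel_steps)
    # Language-model channel: assumed to see all target-side text.
    lm_loss = sum(nll(l, y) for _, l, y in parallel_steps)
    lm_loss += sum(nll(l, y) for l, y in monolingual_steps)
    return translation_loss + lm_weight * lm_loss


# Toy usage with a 4-word vocabulary and uniform (uninformative) logits.
uniform = [0.0, 0.0, 0.0, 0.0]
parallel = [(uniform, uniform, 2)]
monolingual = [(uniform, 1)]
loss = multitask_loss(parallel, monolingual, lm_weight=1.0)
```

Because monolingual sentences contribute only to the language-model term, a mixed batch updates the lowest decoder layer with far more target-side text than the bilingual data alone provides, which is the motivation given in the abstract.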

Citation (APA)

Domhan, T., & Hieber, F. (2017). Using target-side monolingual data for neural machine translation through multi-task learning. In EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 1500–1505). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/d17-1158
