Regularization techniques for fine-tuning in neural machine translation

Abstract

We investigate techniques for supervised domain adaptation for neural machine translation, where an existing model trained on a large out-of-domain dataset is adapted to a small in-domain dataset. In this scenario, overfitting is a major challenge. We evaluate a number of techniques to reduce overfitting and improve transfer learning, including regularization techniques such as dropout and L2 regularization towards an out-of-domain prior. In addition, we introduce tuneout, a novel regularization technique inspired by dropout. We apply these techniques, alone and in combination, to neural machine translation, obtaining improvements on IWSLT datasets for English→German and English→Russian. We also examine the amount of in-domain training data needed for domain adaptation in NMT, and find a logarithmic relationship between the amount of training data and the gain in BLEU score.
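
The two regularization schemes named in the abstract lend themselves to a compact sketch. The following is a minimal, hypothetical PyTorch rendering (the framework choice is an assumption; the abstract does not prescribe an implementation): `l2_toward_prior` penalizes the squared distance of the fine-tuned weights from a snapshot of the out-of-domain model, and `tuneout_params` treats tuneout as dropout applied to the offset from that snapshot, resetting a random subset of weights to their out-of-domain values. Both function names are illustrative, not from the paper.

```python
import torch


def l2_toward_prior(model, prior, weight):
    """L2 penalty on the distance from the out-of-domain prior.

    Unlike ordinary weight decay, which shrinks parameters toward
    zero, this anchors them to the out-of-domain model, so the
    fine-tuned weights are discouraged from drifting far from it.
    """
    penalty = torch.zeros((), device=next(model.parameters()).device)
    for name, param in model.named_parameters():
        penalty = penalty + ((param - prior[name]) ** 2).sum()
    return weight * penalty


def tuneout_params(param, prior_param, p):
    """Sketch of tuneout as dropout on the offset from the prior.

    With probability p each weight is reset to its out-of-domain
    value; surviving offsets are rescaled as in inverted dropout
    (the rescaling is an assumption, not stated in the abstract).
    """
    mask = (torch.rand_like(param) > p).float() / (1.0 - p)
    return prior_param + mask * (param - prior_param)


# Usage: snapshot the out-of-domain weights before fine-tuning,
# then add the penalty to the in-domain training loss.
# prior = {n: p.detach().clone() for n, p in model.named_parameters()}
# loss = nll_loss + l2_toward_prior(model, prior, weight=1e-4)
```

Anchoring the L2 penalty at the prior rather than at zero is what distinguishes this scheme from ordinary weight decay: the regularizer resists drift away from the out-of-domain model instead of shrinking weights toward the origin.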

Citation (APA)

Miceli Barone, A. V., Haddow, B., Germann, U., & Sennrich, R. (2017). Regularization techniques for fine-tuning in neural machine translation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (pp. 1489–1494). Association for Computational Linguistics. https://doi.org/10.18653/v1/d17-1156
