On the sparsity of neural machine translation models


Abstract

Modern neural machine translation (NMT) models employ a large number of parameters, which leads to serious over-parameterization and typically causes the underutilization of computational resources. In response to this problem, we empirically investigate whether the redundant parameters can be reused to achieve better performance. Experiments and analyses are systematically conducted on different datasets and NMT architectures. We show that: 1) the pruned parameters can be rejuvenated to improve the baseline model by up to +0.8 BLEU points; 2) the rejuvenated parameters are reallocated to enhance the ability of modeling low-level lexical information.
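The abstract describes a prune-then-rejuvenate procedure at a high level. As a minimal, hypothetical sketch (not the authors' exact method), the pruning step can be illustrated with per-layer magnitude pruning in PyTorch; the `sparsity` ratio and the layer-selection rule here are assumptions for illustration only.

```python
import torch
import torch.nn as nn

def magnitude_prune_masks(model: nn.Module, sparsity: float = 0.3) -> dict:
    """Build binary masks that zero out the lowest-magnitude weights per layer.

    `sparsity` is a hypothetical pruning ratio; the paper's actual schedule
    and pruning criteria may differ.
    """
    masks = {}
    for name, param in model.named_parameters():
        if param.dim() < 2:  # skip biases and layer-norm parameters
            continue
        k = int(param.numel() * sparsity)
        if k == 0:
            continue
        # Threshold at the k-th smallest absolute value in this weight matrix.
        threshold = param.detach().abs().flatten().kthvalue(k).values
        masks[name] = (param.detach().abs() > threshold).float()
    return masks

def apply_masks(model: nn.Module, masks: dict) -> None:
    """Zero out pruned weights in place (the 'pruning' phase)."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                param.mul_(masks[name])
```

Rejuvenation, roughly, would then drop the masks and continue training so that the previously zeroed parameters are re-learned alongside the retained ones rather than discarded; the exact re-initialization and training schedule follow the paper, not this sketch.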

Citation (APA)

Wang, Y., Wang, L., Li, V. O. K., & Tu, Z. (2020). On the sparsity of neural machine translation models. In EMNLP 2020 - 2020 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference (pp. 1060–1066). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.emnlp-main.78
