Efficient language modeling with automatic relevance determination in recurrent neural networks

Citations: 2 · Readers (Mendeley): 72

Abstract

Reducing the number of parameters is one of the most important goals in deep learning. In this article we propose an adaptation of Doubly Stochastic Variational Inference for Automatic Relevance Determination (DSVI-ARD) to neural network compression. We find this method to be especially useful in language modeling tasks, where the large number of parameters in the input and output layers is often excessive. We also show that DSVI-ARD can be applied together with encoder-decoder weight tying, which allows achieving even better sparsity and performance. Our experiments demonstrate that more than 90% of the weights in both the encoder and decoder layers can be removed with minimal quality loss.
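To make the ARD mechanism concrete, below is a minimal, hypothetical sketch of an ARD-regularized linear layer in PyTorch. It is not the authors' implementation: the class name ARDLinear, the initialization values, and the pruning threshold of 3.0 are illustrative assumptions. The layer keeps a factorized Gaussian posterior N(mu, sigma^2) over each weight; with the ARD prior variance set to its analytic optimum, the per-weight KL term reduces to 0.5 * log(1 + mu^2 / sigma^2), and weights whose noise-to-signal ratio log(sigma^2 / mu^2) exceeds the threshold are pruned.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ARDLinear(nn.Module):
    """Illustrative linear layer with a factorized Gaussian posterior
    q(w) = N(mu, sigma^2) and an ARD prior optimized analytically,
    giving KL_i = 0.5 * log(1 + mu_i^2 / sigma_i^2). Not the paper's code."""

    def __init__(self, in_features, out_features, prune_threshold=3.0):
        super().__init__()
        self.mu = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.log_sigma2 = nn.Parameter(torch.full((out_features, in_features), -10.0))
        self.bias = nn.Parameter(torch.zeros(out_features))
        # Prune weights where log(sigma^2 / mu^2) > prune_threshold (assumed value).
        self.prune_threshold = prune_threshold

    @property
    def log_alpha(self):
        # log alpha = log(sigma^2 / mu^2); large alpha means the weight is noise-dominated.
        return self.log_sigma2 - 2.0 * torch.log(torch.abs(self.mu) + 1e-8)

    def forward(self, x):
        if self.training:
            # Local reparameterization: sample pre-activations instead of weights.
            mean = F.linear(x, self.mu, self.bias)
            var = F.linear(x.pow(2), self.log_sigma2.exp()) + 1e-8
            return mean + var.sqrt() * torch.randn_like(mean)
        # At evaluation time, use posterior means with pruned weights zeroed out.
        mask = (self.log_alpha < self.prune_threshold).float()
        return F.linear(x, self.mu * mask, self.bias)

    def kl(self):
        # KL(q || p) with the ARD prior variance at its optimum mu^2 + sigma^2.
        return 0.5 * torch.log1p(self.mu.pow(2) / self.log_sigma2.exp()).sum()

    def sparsity(self):
        # Fraction of weights that would be pruned at the current threshold.
        return (self.log_alpha >= self.prune_threshold).float().mean().item()

# Toy usage: minimize task loss plus a (scaled) KL penalty, as in an ELBO.
layer = ARDLinear(256, 10000)
x = torch.randn(32, 256)
loss = layer(x).pow(2).mean() + 1e-4 * layer.kl()
loss.backward()
```

Under encoder-decoder weight tying, the embedding (encoder) and softmax (decoder) layers would share the same mu and log_sigma2 parameters, so a single ARD posterior, and hence a single sparsity pattern, governs both layers, which is consistent with the abstract's claim that tying improves sparsity.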

Citation (APA)

Kodryan, M., Grachev, A., Ignatov, D., & Vetrov, D. (2019). Efficient language modeling with automatic relevance determination in recurrent neural networks. In Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019) (pp. 40–48). Association for Computational Linguistics. https://doi.org/10.18653/v1/w19-4306
