On the initialization of long short-term memory networks

Mostafa Mehdipour Ghazi; Mads Nielsen; Akshay Pai; Marc Modat; M. Jorge Cardoso; Sébastien Ourselin; Lauge Sørensen

Conference Proceedings


Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019) 11953 LNCS 275-286

DOI: 10.1007/978-3-030-36708-4_23


Abstract

Weight initialization is important for faster convergence and more stable training of deep neural networks. In this paper, a robust initialization method is developed to address training instability in long short-term memory (LSTM) networks. It is based on a normalized random initialization of the network weights that aims to keep the variance of the network input and output in the same range. The method is applied to standard LSTMs for univariate time series regression and to LSTMs robust to missing values for multivariate disease progression modeling. The results show that, in all cases, the proposed initialization method outperforms state-of-the-art initialization techniques in terms of training convergence and generalization performance of the obtained solution.
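The abstract does not give the paper's initialization formula, so the following is only a minimal sketch of the general idea of variance-preserving random initialization. It uses the well-known Glorot (Xavier) uniform scheme as a stand-in, not the authors' method. The helper names (`glorot_uniform`, `init_lstm_weights`, `variance`), the gate labels, and the zero biases are assumptions made for illustration.

```python
import math
import random


def glorot_uniform(fan_in, fan_out, rng):
    """Sample a fan_out x fan_in matrix from U(-limit, limit).

    With limit = sqrt(6 / (fan_in + fan_out)), each weight has
    variance 2 / (fan_in + fan_out) (Glorot/Xavier uniform scheme).
    """
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)]
            for _ in range(fan_out)]


def init_lstm_weights(input_size, hidden_size, seed=0):
    """Initialize input and recurrent weights for the four LSTM gates.

    Illustrative stand-in only: every gate gets Glorot-uniform
    weights and zero biases.
    """
    rng = random.Random(seed)
    weights = {}
    for gate in ("input", "forget", "cell", "output"):
        weights[gate] = {
            "W_x": glorot_uniform(input_size, hidden_size, rng),   # input-to-hidden
            "W_h": glorot_uniform(hidden_size, hidden_size, rng),  # hidden-to-hidden
            "b": [0.0] * hidden_size,
        }
    return weights


def variance(matrix):
    """Population variance of all entries of a nested-list matrix."""
    values = [w for row in matrix for w in row]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


if __name__ == "__main__":
    w = init_lstm_weights(input_size=8, hidden_size=64)
    target = 2.0 / (64 + 64)
    print("target variance:   ", target)
    print("empirical variance:", variance(w["forget"]["W_h"]))
```

Running the script prints the target variance next to the empirical variance of one recurrent weight matrix, which should be close for matrices of this size. The paper's normalization may differ from this scheme in how it treats the gates and the recurrent connections.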


Citation (APA)

Mehdipour Ghazi, M., Nielsen, M., Pai, A., Modat, M., Cardoso, M. J., Ourselin, S., & Sørensen, L. (2019). On the initialization of long short-term memory networks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11953 LNCS, pp. 275–286). Springer. https://doi.org/10.1007/978-3-030-36708-4_23
