Learning with Random Learning Rates

5 citations · 102 Mendeley readers

Abstract

In neural network optimization, the learning rate of gradient descent strongly affects performance. This prevents reliable out-of-the-box training of a model on a new problem. We propose the All Learning Rates At Once (Alrao) algorithm for deep learning architectures: each neuron or unit in the network gets its own learning rate, randomly sampled at startup from a distribution spanning several orders of magnitude. The network becomes a mixture of slow and fast learning units. Surprisingly, Alrao performs close to SGD with an optimally tuned learning rate, across various tasks and network architectures. In our experiments, all Alrao runs were able to learn well without any tuning.
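To make the idea concrete, here is a minimal sketch of per-unit learning rates in plain Python. The abstract only states that each unit's rate is sampled at startup from a distribution spanning several orders of magnitude; the log-uniform distribution and the [1e-5, 10] range below are illustrative assumptions, not details taken from the abstract.

```python
import math
import random

random.seed(0)

def sample_unit_lrs(n_units, lr_min=1e-5, lr_max=10.0):
    # One learning rate per unit, sampled log-uniformly so that the rates
    # span several orders of magnitude. (Distribution and range are
    # illustrative assumptions.)
    return [math.exp(random.uniform(math.log(lr_min), math.log(lr_max)))
            for _ in range(n_units)]

def per_unit_sgd_step(weights, grads, unit_lrs):
    # One SGD step in which each output unit (one row of the weight
    # matrix) uses its own fixed learning rate, sampled once at startup.
    return [[w - lr * g for w, g in zip(w_row, g_row)]
            for w_row, g_row, lr in zip(weights, grads, unit_lrs)]

# Toy usage: a layer with 4 output units and 3 inputs, unit gradients.
weights = [[0.1] * 3 for _ in range(4)]
grads = [[1.0] * 3 for _ in range(4)]
lrs = sample_unit_lrs(4)
new_weights = per_unit_sgd_step(weights, grads, lrs)
```

Because the rates are fixed at startup, units whose sampled rate happens to be near-optimal learn quickly, while the rest stay close to their initialization, which is the mixture-of-slow-and-fast behavior the abstract describes.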

Citation (APA)

Blier, L., Wolinski, P., & Ollivier, Y. (2020). Learning with Random Learning Rates. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11907 LNAI, pp. 449–464). Springer. https://doi.org/10.1007/978-3-030-46147-8_27
