Training deep neural networks using conjugate gradient-like methods

Abstract

The goal of this article is to accelerate useful adaptive learning rate optimization algorithms, such as AdaGrad, RMSProp, Adam, and AMSGrad, for training deep neural networks. To reach this goal, we devise an iterative algorithm that combines the existing adaptive learning rate optimization algorithms with conjugate gradient-like methods, which are useful for constrained optimization. Convergence analyses show that the proposed algorithm with a small constant learning rate approximates a stationary point of a nonconvex optimization problem in deep learning. Furthermore, it is shown that the proposed algorithm with diminishing learning rates converges to a stationary point of the nonconvex optimization problem. The convergence and performance of the algorithm are demonstrated through numerical comparisons with the existing adaptive learning rate optimization algorithms for image and text classification. The numerical results show that the proposed algorithm with a constant learning rate is superior to the existing algorithms for training neural networks.
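The abstract does not spell out the update rule, but the combination it describes can be pictured as replacing the momentum-style first-moment term of an adaptive method with a conjugate gradient-like search direction, while keeping the second-moment scaling. The following is a minimal sketch under that reading, using a Fletcher-Reeves-style conjugate coefficient and Adam/RMSProp-style scaling; the function name, hyperparameters, and coefficient choice are illustrative assumptions, not the paper's exact CoBA algorithm.

```python
import numpy as np

def cg_like_adaptive_step(grad_fn, x0, alpha=1e-3, beta2=0.999,
                          eps=1e-8, num_steps=1000):
    """Hypothetical adaptive-learning-rate method with a conjugate
    gradient-like search direction (illustrative sketch only).

    grad_fn: callable returning the (stochastic) gradient at x.
    """
    x = x0.copy()
    g = grad_fn(x)
    d = -g                       # initial direction: steepest descent
    v = np.zeros_like(x)         # second-moment estimate (RMSProp/Adam-style)
    for _ in range(num_steps):
        v = beta2 * v + (1 - beta2) * g * g       # adaptive scaling term
        x = x + alpha * d / (np.sqrt(v) + eps)    # scaled step along d
        g_new = grad_fn(x)
        # Fletcher-Reeves-style conjugate coefficient
        gamma = (g_new @ g_new) / (g @ g + eps)
        d = -g_new + gamma * d                    # conjugate-like direction
        g = g_new
    return x

# Example: minimize f(x) = ||x||^2, whose gradient is 2x.
x_star = cg_like_adaptive_step(lambda x: 2 * x, np.array([3.0, -2.0]))
```

Scaling the conjugate direction elementwise by the inverse square root of the second-moment estimate is one natural way to marry the two families of methods; the paper's actual coefficients, learning rate conditions, and convergence guarantees are given in the full text.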

Citation (APA)

Iiduka, H., & Kobayashi, Y. (2020). Training deep neural networks using conjugate gradient-like methods. Electronics (Switzerland), 9(11), 1–25. https://doi.org/10.3390/electronics9111809
