Non-convex optimization using parameter continuation methods for deep neural networks

Abstract

Numerical parameter continuation methods are widely used to optimize non-convex problems. These methods have had many applications in physics and mathematical analysis, such as the bifurcation analysis of dynamical systems. However, as far as we know, such efficient methods have seen relatively limited use in the optimization of neural networks. In this chapter, we propose a novel training method for deep neural networks based on ideas from parameter continuation methods and compare it with widely used methods such as Stochastic Gradient Descent (SGD), AdaGrad, RMSProp, and ADAM. Transfer and curriculum learning have recently shown exceptional performance gains in deep learning and are intuitively similar to homotopy or continuation techniques. Our proposed method, however, leverages decades of theoretical and computational work and can be viewed as an initial bridge between those techniques and deep neural networks. In particular, we illustrate a method that we call Natural Parameter Adaption Continuation with Secant approximation (NPACS). Here we transform commonly used activation functions into homotopic versions, which allows the complex optimization problem to be decomposed into a sequence of problems, each of which is provided with a good initial guess based on the solution of the previous problem. NPACS combines this scheme with ADAM to obtain faster convergence. We demonstrate the effectiveness of our method on standard benchmark problems, computing local minima more rapidly and achieving lower generalization error than contemporary techniques in the majority of cases.
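To make the warm-start idea concrete, the following is a minimal sketch, not the authors' NPACS implementation (it omits the secant predictor and any adaptive step-size machinery): an activation function is deformed from a near-linear, easier-to-optimize form toward the full nonlinearity, with ADAM run at each continuation step and the parameters carried over from the previous solution. The HomotopicTanh class, the lambda schedule, and the toy model and data are illustrative assumptions, not details taken from the chapter.

# Minimal sketch (illustrative, not the authors' code): natural-parameter
# continuation on a homotopic activation, trained with ADAM at each step.
import torch
import torch.nn as nn

class HomotopicTanh(nn.Module):
    # sigma_lambda(x) = (1 - lam) * x + lam * tanh(x): linear at lam = 0, tanh at lam = 1.
    def __init__(self, lam=0.0):
        super().__init__()
        self.lam = lam

    def forward(self, x):
        return (1.0 - self.lam) * x + self.lam * torch.tanh(x)

def train_with_continuation(model, acts, data, target, lambdas, steps=200):
    loss_fn = nn.MSELoss()
    for lam in lambdas:                       # sequence of easier -> harder problems
        for act in acts:
            act.lam = lam                     # deform activations toward tanh
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        for _ in range(steps):                # warm start: parameters carry over
            opt.zero_grad()
            loss = loss_fn(model(data), target)
            loss.backward()
            opt.step()
        print(f"lambda={lam:.2f}  loss={loss.item():.4f}")

# Toy usage on synthetic regression data (hypothetical sizes).
torch.manual_seed(0)
acts = [HomotopicTanh(), HomotopicTanh()]
model = nn.Sequential(nn.Linear(4, 16), acts[0], nn.Linear(16, 16), acts[1], nn.Linear(16, 1))
data, target = torch.randn(256, 4), torch.randn(256, 1)
train_with_continuation(model, acts, data, target, lambdas=[0.0, 0.25, 0.5, 0.75, 1.0])

At lambda = 0 every layer in this sketch is affine, so the first subproblem is close to linear regression; each later value of lambda reuses the previously found minimizer as its initial guess, which is the decomposition into a sequence of warm-started problems that the abstract describes.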

Citation (APA)

Nilesh Pathak, H., & Clinton Paffenroth, R. (2021). Non-convex optimization using parameter continuation methods for deep neural networks. In Advances in Intelligent Systems and Computing (Vol. 1232, pp. 273–298). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-15-6759-9_12
