The advancement of machine learning has given rise to numerous intriguing optimization problems. Stochastic first-order methods are the predominant choice for these problems because of their efficiency, but noisy gradient estimates and the high nonlinearity of the loss function lead to slow convergence. Second-order algorithms have well-known advantages in dealing with highly nonlinear and ill-conditioned problems. This paper reviews recent developments in stochastic variants of quasi-Newton methods, which construct Hessian approximations using only gradient information. We concentrate on BFGS-based methods in stochastic settings and highlight the algorithmic improvements that enable them to work in various scenarios. Future research on stochastic quasi-Newton methods should focus on enhancing their applicability, lowering computational and storage costs, and improving convergence rates.
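To illustrate the idea the survey covers, the sketch below shows one damped stochastic BFGS step in which the inverse-Hessian approximation is updated from gradient differences alone. It is a minimal illustration, not the algorithm of any specific paper in the survey: the function names, step size, and skip rule are assumptions, and the gradient is assumed to be evaluated on the same mini-batch at both points, a common device in stochastic BFGS variants to keep the curvature pair consistent.

```python
import numpy as np

def stochastic_bfgs_step(x, H, grad_fn, lr=0.1, damping=1e-4):
    """One illustrative stochastic BFGS step (not a specific published method).

    x       : current iterate (1-D array)
    H       : current inverse-Hessian approximation (square matrix)
    grad_fn : returns a stochastic gradient estimate at a point; assumed to be
              evaluated on the same mini-batch for both calls below
    """
    g = grad_fn(x)
    x_new = x - lr * (H @ g)                # quasi-Newton search direction
    s = x_new - x                           # iterate difference
    y = grad_fn(x_new) - g                  # gradient difference (same batch)
    sy = s @ y
    # Skip the update when the curvature condition is too weak (illustrative rule).
    if sy > damping * (s @ s):
        rho = 1.0 / sy
        I = np.eye(len(x))
        V = I - rho * np.outer(s, y)
        # Standard BFGS update of the inverse-Hessian approximation,
        # built only from the gradient-difference pair (s, y).
        H = V @ H @ V.T + rho * np.outer(s, s)
    return x_new, H
```

In practice, the surveyed methods replace the dense matrix H with limited-memory (L-BFGS-style) representations and add further safeguards against noise, but the curvature pair (s, y) built from gradient information is the common ingredient.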
CITATION
Guo, T. D., Liu, Y., & Han, C. Y. (2023, June 1). An Overview of Stochastic Quasi-Newton Methods for Large-Scale Machine Learning. Journal of the Operations Research Society of China. Springer. https://doi.org/10.1007/s40305-023-00453-9