Many machine learning and statistical inference problems require minimization of a composition of expected value functions (CEVF). Of particular interest are the finite-sum versions of such compositional optimization problems (FS-CEVF). Compositional stochastic variance reduced gradient (C-SVRG) methods, which combine stochastic compositional gradient descent (SCGD) and stochastic variance reduced gradient descent (SVRG), are the state-of-the-art methods for FS-CEVF problems. We introduce compositional stochastic average gradient descent (C-SAG), a novel extension of the stochastic average gradient method (SAG) for minimizing compositions of finite-sum functions. C-SAG, like SAG, estimates the gradient by incorporating memory of previous gradient information. We present theoretical analyses of C-SAG which show that C-SAG, like C-SVRG, achieves a linear convergence rate for strongly convex objective functions; however, C-SAG achieves lower oracle query complexity per iteration than C-SVRG. Finally, we present experimental results showing that C-SAG converges substantially faster than both full gradient (FG) and C-SVRG.
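To illustrate what "memory of previous gradient information" means, the following is a minimal sketch of plain SAG on an ordinary (non-compositional) finite sum. It is not the C-SAG algorithm of the paper, whose update rules are not given in this abstract. The function name `sag_least_squares`, the least-squares objective, and the step-size choice in the usage note are assumptions made for illustration only.

```python
import numpy as np


def sag_least_squares(A, b, step, n_iters, seed=0):
    """Plain SAG sketch for f(x) = (1/n) * sum_i 0.5 * (a_i @ x - b_i)**2.

    Illustrative only: this is standard SAG on a simple finite sum,
    not the compositional C-SAG method described in the abstract.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    grad_memory = np.zeros((n, d))  # last gradient computed for each component
    grad_sum = np.zeros(d)          # running sum of the stored gradients

    for _ in range(n_iters):
        i = rng.integers(n)                    # sample one component uniformly
        g_new = (A[i] @ x - b[i]) * A[i]       # fresh gradient of component i
        grad_sum += g_new - grad_memory[i]     # swap the stale entry for the fresh one
        grad_memory[i] = g_new
        x -= step * grad_sum / n               # step along the average of stored gradients
    return x
```

Each iteration queries only one component gradient, yet the step uses an average over all n stored gradients. A conservative step size for this sketch is 1 / (16 L), where L is the largest squared row norm of `A`; that choice is likewise an assumption of the example.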
CITATION STYLE
Hsieh, T. Y., EL-Manzalawy, Y., Sun, Y., & Honavar, V. (2018). Compositional Stochastic Average Gradient for Machine Learning and Related Applications. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11314 LNCS, pp. 740–752). Springer Verlag. https://doi.org/10.1007/978-3-030-03493-1_77