In this work, we generalize and unify two recent, quite different approaches of Jascha [10] and Lee [2] by proposing the proximal stochastic Newton-type gradient (PROXTONE) method for minimizing the sum of two convex functions: one is the average of a huge number of smooth convex functions, and the other is a nonsmooth convex function. PROXTONE incorporates second-order information to obtain stronger convergence results, in that it achieves a linear convergence rate not only in the value of the objective function but also in the iterates. The proofs are simple and intuitive, and the results and techniques can serve as a starting point for research on proximal stochastic methods that employ second-order information. The methods and principles proposed in this paper apply to problems such as logistic regression and the training of deep neural networks. Our numerical experiments show that PROXTONE achieves better computational performance than existing methods.
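The composite objective described above has the form F(x) = (1/n) Σ f_i(x) + h(x), with smooth f_i and nonsmooth h. As an illustration only (not the authors' PROXTONE algorithm), the sketch below shows a proximal stochastic step that uses a crude diagonal second-order scaling, applied to L1-regularized logistic regression; all function names, the step size `eta`, and the diagonal curvature estimate are illustrative assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * ||x||_1 (coordinate-wise shrinkage).
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_scaled_step(x, grad_i, H_diag, lam, eta):
    # One scaled proximal step: with a diagonal curvature estimate H_diag,
    # the local quadratic model plus lam * ||x||_1 separates coordinate-wise,
    # so the minimizer is a soft-thresholded scaled gradient step.
    z = x - eta * grad_i / H_diag
    return soft_threshold(z, eta * lam / H_diag)

def run(A, b, lam=0.05, eta=0.1, epochs=50, seed=0):
    # Stochastic proximal iterations for L1-regularized logistic regression.
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    # Crude fixed diagonal curvature estimate for the logistic loss
    # (0.25 bounds the second derivative of the logistic function).
    H_diag = 0.25 * (A ** 2).mean(axis=0) + 1e-3
    for _ in range(epochs):
        for i in rng.permutation(n):
            p = 1.0 / (1.0 + np.exp(-(A[i] @ x)))  # predicted probability
            grad_i = (p - b[i]) * A[i]             # per-sample gradient
            x = prox_scaled_step(x, grad_i, H_diag, lam, eta)
    return x
```

The diagonal scaling is the simplest stand-in for second-order information; the point is only that the nonsmooth term is handled through its proximal operator while the smooth part is sampled stochastically.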
Shi, Z., & Liu, R. (2015). Large scale optimization with proximal stochastic newton-type gradient descent. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9284, pp. 691–704). Springer Verlag. https://doi.org/10.1007/978-3-319-23528-8_43