Large Scale Optimization with Proximal Stochastic Newton-Type Gradient Descent

Abstract

In this work, we generalize and unify two recent and quite different lines of work, by Jascha [10] and Lee [2], by proposing the proximal stochastic Newton-type gradient (PROXTONE) method for minimizing the sum of two convex functions: one is the average of a huge number of smooth convex functions, and the other is a nonsmooth convex function. PROXTONE incorporates second-order information to obtain stronger convergence results: it achieves a linear convergence rate not only in the value of the objective function, but also in the solution. The proofs are simple and intuitive, and the results and techniques can serve as a starting point for research on proximal stochastic methods that employ second-order information. The methods and principles proposed in this paper can be applied to logistic regression, the training of deep neural networks, and so on. Our numerical experiments show that PROXTONE achieves better computational performance than existing methods.
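
For context, the composite problem described above can be written as

    \min_{x \in \mathbb{R}^d} F(x) := f(x) + h(x), \qquad f(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x),

where each f_i is smooth and convex and h is convex but possibly nonsmooth. A generic proximal Newton-type iteration, of the kind such methods build on (the exact PROXTONE update, including how the stochastic gradients and Hessian approximations enter, is specified in the paper), solves a scaled proximal subproblem at each step:

    x_{k+1} = \arg\min_{x} \left\{ \nabla f(x_k)^{\top}(x - x_k) + \tfrac{1}{2}(x - x_k)^{\top} H_k (x - x_k) + h(x) \right\},

where H_k approximates the Hessian of f at x_k; choosing H_k = \tfrac{1}{\eta} I recovers the standard proximal gradient step with step size \eta.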

Cite

APA

Shi, Z., & Liu, R. (2015). Large scale optimization with proximal stochastic Newton-type gradient descent. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9284, pp. 691–704). Springer Verlag. https://doi.org/10.1007/978-3-319-23528-8_43
