Do subsampled Newton methods work for high-dimensional data?

Abstract

Subsampled Newton methods approximate the Hessian matrix through subsampling to reduce the per-iteration cost. Previous results require Ω(d) samples to approximate the Hessian, where d is the dimension of the data points, making these methods impractical for high-dimensional data. The situation worsens when d is comparable to the number of data points n: the approximation then requires taking nearly the whole dataset into account, rendering subsampling useless. This paper theoretically justifies the effectiveness of subsampled Newton methods for strongly convex empirical risk minimization with high-dimensional data. Specifically, we provably require only Θ̃(d^γ_eff) samples for approximating the Hessian matrices, where d^γ_eff is the γ-ridge effective dimension and can be much smaller than d as long as nγ ≫ 1. Our theory covers three types of Newton methods: subsampled Newton, distributed Newton, and proximal Newton.
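To make the two central ideas of the abstract concrete, here is a minimal Python sketch of (a) one subsampled Newton step and (b) the γ-ridge effective dimension. The sketch makes several assumptions not fixed by the abstract: it uses ridge-regularized logistic regression as the strongly convex ERM instance, uniform subsampling rather than the leverage-score-based sampling the paper's theory concerns, and the common identity d^γ_eff = Σ_i s_i² / (s_i² + nγ), with s_i the singular values of the data matrix. All function names and parameter choices are illustrative, not from the paper.

import numpy as np

def ridge_effective_dimension(X, gamma):
    # gamma-ridge effective dimension: sum_i s_i^2 / (s_i^2 + n*gamma),
    # computed from the singular values of X. The nγ scaling here is an
    # assumption; it can be far below d when n*gamma >> 1.
    n = X.shape[0]
    s = np.linalg.svd(X, compute_uv=False)
    return np.sum(s**2 / (s**2 + n * gamma))

def subsampled_newton_step(w, X, y, gamma, m, rng):
    # One Newton step for ridge-regularized logistic regression: the
    # gradient is exact, but the Hessian of the empirical loss is
    # estimated from m uniformly subsampled data points.
    n, d = X.shape
    p = 1.0 / (1.0 + np.exp(-X @ w))            # sigmoid predictions
    grad = X.T @ (p - y) / n + gamma * w        # full gradient of the ERM objective
    idx = rng.choice(n, size=m, replace=False)  # uniform subsample
    Xs, ps = X[idx], p[idx]
    D = ps * (1.0 - ps)                         # per-sample Hessian weights
    H = (Xs * D[:, None]).T @ Xs / m + gamma * np.eye(d)  # subsampled Hessian
    return w - np.linalg.solve(H, grad)

rng = np.random.default_rng(0)
n, d, gamma = 2000, 500, 1e-2
X = rng.standard_normal((n, d)) / np.sqrt(d)
y = (rng.random(n) < 0.5).astype(float)
print("d =", d, " d_eff^gamma ~", round(float(ridge_effective_dimension(X, gamma)), 1))
w = np.zeros(d)
for _ in range(10):
    w = subsampled_newton_step(w, X, y, gamma, m=200, rng=rng)
print("final gradient norm:",
      np.linalg.norm(X.T @ (1.0 / (1.0 + np.exp(-X @ w)) - y) / n + gamma * w))

Note that only the Hessian is subsampled; the gradient is computed on the full dataset, which is the standard setup for subsampled Newton methods. The paper's claim, in these terms, is that the sample size m can be taken on the order of d^γ_eff (up to logarithmic factors) rather than d.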

Citation (APA)

Li, X., Wang, S., & Zhang, Z. (2020). Do subsampled Newton methods work for high-dimensional data? In AAAI 2020 - 34th AAAI Conference on Artificial Intelligence (pp. 4723–4730). AAAI Press. https://doi.org/10.1609/aaai.v34i04.5905
