Fast curvature matrix-vector products

Nicol N. Schraudolph

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2001) 2130 19-26

DOI: 10.1007/3-540-44668-0_4

Abstract

The Gauss-Newton approximation of the Hessian guarantees positive semi-definiteness while retaining more second-order information than the Fisher information. We extend it from nonlinear least squares to all differentiable objectives such that positive semi-definiteness is maintained for the standard loss functions in neural network regression and classification. We give efficient algorithms for computing the product of extended Gauss-Newton and Fisher information matrices with arbitrary vectors, using techniques similar to but even cheaper than the fast Hessian-vector product [1]. The stability of SMD [2,3,4,5], a learning rate adaptation method that uses curvature matrix-vector products, improves when the extended Gauss-Newton matrix is substituted for the Hessian.
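The core idea of a matrix-free curvature product can be illustrated in the classical nonlinear least-squares case, where the Gauss-Newton matrix is G = J^T J and J is the Jacobian of the model outputs with respect to the weights. The following is a minimal sketch, not the paper's algorithm: the one-layer model y = tanh(W x), the squared-error loss, and the function names (forward, jvp, vjp, gauss_newton_vp) are all assumptions chosen for illustration. It computes G v with one forward-mode pass (J v) followed by one backward pass (J^T u), without ever forming G.

```python
import math


def forward(W, x):
    """Hypothetical tiny model y = tanh(W x); W is a list of rows."""
    return [math.tanh(sum(w * xj for w, xj in zip(row, x))) for row in W]


def jvp(W, x, V):
    """Forward-mode pass: J v, the first-order change in y for a weight perturbation V."""
    y = forward(W, x)
    return [
        (1.0 - yi * yi) * sum(v * xj for v, xj in zip(v_row, x))
        for yi, v_row in zip(y, V)
    ]


def vjp(W, x, u):
    """Backward pass: J^T u, returned in the same shape as W."""
    y = forward(W, x)
    return [[(1.0 - yi * yi) * ui * xj for xj in x] for yi, ui in zip(y, u)]


def gauss_newton_vp(W, x, V):
    """G v = J^T (J v) for the loss 0.5 * ||y - t||^2.

    The Hessian of this loss with respect to y is the identity, so no
    extra factor appears between the two passes and the target t is not needed.
    """
    return vjp(W, x, jvp(W, x, V))


# Illustrative values only.
W = [[0.5, -0.3], [0.1, 0.8]]
x = [1.0, 2.0]
V = [[1.0, 0.0], [-2.0, 0.5]]

Gv = gauss_newton_vp(W, x, V)

# v^T G v equals ||J v||^2, so it cannot be negative.
quad = sum(g * v for g_row, v_row in zip(Gv, V) for g, v in zip(g_row, v_row))
```

Because v^T G v is a squared norm in this setting, the product is positive semi-definite by construction. For losses other than squared error, the Hessian of the loss with respect to the model outputs would be applied to J v between the two passes; that step is omitted here.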

Citation (APA)

Schraudolph, N. N. (2001). Fast curvature matrix-vector products. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2130, pp. 19–26). Springer Verlag. https://doi.org/10.1007/3-540-44668-0_4
