Randomized K-FACs: Speeding Up K-FAC with Randomized Numerical Linear Algebra

Abstract

K-FAC is a successful, tractable implementation of natural gradient for deep learning, which nevertheless suffers from the requirement to compute the inverses of the Kronecker factors (through an eigen-decomposition). This can be very time-consuming (or even prohibitive) when these factors are large. In this paper, we show theoretically that, owing to the exponential-average construction paradigm typically used for the Kronecker factors, their eigen-spectrum must decay. We show numerically that in practice this decay is very rapid, which leads to the idea that we can save substantial computation by focusing only on the first few eigen-modes when inverting the Kronecker factors. Randomized numerical linear algebra provides us with the necessary tools to do so. Numerical results show that we obtain a ≈2.5× reduction in per-epoch time and a ≈3.3× reduction in time to target accuracy. We compare our proposed sped-up K-FAC versions with a more computationally efficient NG implementation, SENG, and observe that we perform on par with it.
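The core idea the abstract describes — keeping only the leading eigen-modes of a symmetric PSD Kronecker factor, found via a randomized range finder, and inverting with damping — can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the function names, the rank/oversampling parameters, and the choice to treat the orthogonal complement as a scaled identity are all assumptions.

```python
import numpy as np

def randomized_low_rank_eig(A, k, oversample=10, seed=0):
    """Approximate top-k eigenpairs of a symmetric PSD matrix A
    using a randomized range finder (sketch-then-project)."""
    n = A.shape[0]
    # Sketch the dominant range of A with a Gaussian test matrix.
    Omega = np.random.default_rng(seed).standard_normal((n, k + oversample))
    Q, _ = np.linalg.qr(A @ Omega)
    # Project A onto the captured subspace; eigendecompose the small matrix.
    B = Q.T @ A @ Q
    w, V = np.linalg.eigh(B)
    # Keep the k largest eigenpairs, mapped back to the original space.
    idx = np.argsort(w)[::-1][:k]
    return w[idx], Q @ V[:, idx]

def approx_damped_inverse(A, k, lam=1e-2):
    """Approximate (A + lam*I)^{-1} from the leading k eigen-modes only.
    On the orthogonal complement, A is treated as negligible relative to
    the damping lam (reasonable when the eigen-spectrum decays rapidly)."""
    w, U = randomized_low_rank_eig(A, k)
    n = A.shape[0]
    inv_core = U @ np.diag(1.0 / (w + lam)) @ U.T
    complement = (np.eye(n) - U @ U.T) / lam
    return inv_core + complement
```

When the spectrum decays fast (as the paper argues it must for exponential-average factors), the truncated modes contribute little, and the low-rank inverse costs O(n²k) instead of the O(n³) of a full eigen-decomposition.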

Citation (APA)
Puiu, C. O. (2022). Randomized K-FACs: Speeding Up K-FAC with Randomized Numerical Linear Algebra. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13756 LNCS, pp. 411–422). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-21753-1_40
