Neural Spectrum Alignment: Empirical Study

Abstract

The expressiveness and generalization of deep models were recently addressed via the connection between neural networks (NNs) and kernel learning, where the first-order dynamics of a NN during gradient-descent (GD) optimization were related to the gradient similarity kernel, also known as the Neural Tangent Kernel (NTK) [9]. In the majority of works this kernel is considered to be time-invariant [9, 13]. In contrast, we empirically explore its properties along the optimization and show that in practice the top eigenfunctions of the NTK align toward the target function learned by the NN, which improves the overall optimization performance. Moreover, these top eigenfunctions serve as basis functions for the NN output: the function represented by the NN is spanned almost completely by them throughout the entire optimization process. Further, we study how learning-rate decay affects the neural spectrum. We argue that the presented phenomena may lead to a more complete theoretical understanding of NN learning.
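
To make the quantities in the abstract concrete, the sketch below computes the empirical NTK Gram matrix (the gradient similarity kernel) of a small network on a batch of inputs, eigendecomposes it, and measures how much of a target function's energy lies in the span of the top NTK eigenvectors. This is a minimal illustrative sketch rather than the authors' code: the MLP architecture, the 1-D sine target, and the top-10 cutoff are all assumptions made here for demonstration.

```python
import torch

# Hypothetical toy setup: a scalar-output MLP on 1-D inputs.
torch.manual_seed(0)
net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1)
)
params = [p for p in net.parameters() if p.requires_grad]

x = torch.linspace(-1.0, 1.0, 100).unsqueeze(1)  # evaluation points
y = torch.sin(3.0 * x).squeeze(1)                # illustrative target function

def per_sample_grads(x):
    """Jacobian of the scalar NN output w.r.t. all parameters, one row per sample."""
    rows = []
    for i in range(x.shape[0]):
        out = net(x[i : i + 1]).squeeze()
        grads = torch.autograd.grad(out, params)
        rows.append(torch.cat([g.reshape(-1) for g in grads]))
    return torch.stack(rows)                     # shape (n, num_params)

J = per_sample_grads(x)
ntk = J @ J.T          # empirical NTK Gram matrix (gradient similarity kernel)

evals, evecs = torch.linalg.eigh(ntk)            # eigenvalues in ascending order
top = evecs[:, -10:]                             # top-10 NTK eigenvectors

# Fraction of the target's energy captured by the span of the top eigenvectors;
# the paper reports this alignment growing as GD optimization proceeds.
proj = top @ (top.T @ y)
alignment = (proj.norm() / y.norm()).item()
print(f"energy of target in top-10 NTK eigenvectors: {alignment:.3f}")
```

Recomputing this alignment at successive GD steps (rather than only at initialization, as above) is how one would observe the time-variation of the kernel that the paper studies.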

Citation (APA)

Kopitkov, D., & Indelman, V. (2020). Neural Spectrum Alignment: Empirical Study. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12397 LNCS, pp. 168–179). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-61616-8_14
