Avoiding Kernel Fixed Points: Computing with ELU and GELU Infinite Networks

Abstract

Analysing and computing with Gaussian processes arising from infinitely wide neural networks has recently seen a resurgence in popularity. Despite this, many explicit covariance functions of networks with activation functions used in modern networks remain unknown. Furthermore, while the kernels of deep networks can be computed iteratively, theoretical understanding of deep kernels is lacking, particularly with respect to fixed-point dynamics. Firstly, we derive the covariance functions of multi-layer perceptrons (MLPs) with exponential linear units (ELU) and Gaussian error linear units (GELU) and evaluate the performance of the limiting Gaussian processes on some benchmarks. Secondly, and more generally, we analyse the fixed-point dynamics of iterated kernels corresponding to a broad range of activation functions. We find that, unlike some previously studied neural network kernels, these new kernels exhibit non-trivial fixed-point dynamics which are mirrored in finite-width neural networks. The fixed-point behaviour present in some networks explains a mechanism for implicit regularisation in overparameterised deep models. Our results relate to both the static iid parameter conjugate kernel and the dynamic neural tangent kernel constructions.
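
To make the iterative kernel computation mentioned in the abstract concrete, the following is a minimal numerical sketch, not the paper's closed-form ELU or GELU covariance functions. It assumes two inputs of equal norm, a standard MLP kernel recursion in which each layer maps the pre-activation variance q and correlation rho to new values, and hypothetical hyperparameter names sw2 and sb2 for the weight and bias variances. The Gaussian expectations are approximated with a simple midpoint rule rather than evaluated analytically, and all function names are illustrative.

```python
import math

def gelu(z):
    """GELU activation: z * Phi(z), with Phi the standard normal CDF."""
    return 0.5 * z * (1.0 + math.erf(z / math.sqrt(2.0)))

def elu(z):
    """ELU activation with unit scale."""
    return z if z > 0.0 else math.expm1(z)

def relu(z):
    return max(z, 0.0)

def normal_grid(n=120, lim=8.0):
    """Midpoint-rule nodes and weights approximating a standard normal."""
    h = 2.0 * lim / n
    pts = [-lim + (i + 0.5) * h for i in range(n)]
    wts = [h * math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi) for t in pts]
    return pts, wts

def layer_map(phi, q, rho, sw2, sb2, grid):
    """One kernel iteration for two inputs sharing variance q, correlation rho.

    Assumed recursion (not taken from the source text):
      q'   = sw2 * E[phi(u)^2]      + sb2
      k'   = sw2 * E[phi(u) phi(v)] + sb2
      rho' = k' / q'
    with (u, v) zero-mean Gaussian, variance q each, correlation rho.
    """
    pts, wts = grid
    s = math.sqrt(q)
    c = math.sqrt(max(0.0, 1.0 - rho * rho))
    var = sum(w * phi(s * a) ** 2 for a, w in zip(pts, wts))
    cov = 0.0
    for a, wa in zip(pts, wts):
        # v = s * (rho * a + c * b) has correlation rho with u = s * a
        inner = sum(wb * phi(s * (rho * a + c * b)) for b, wb in zip(pts, wts))
        cov += wa * phi(s * a) * inner
    q_new = sw2 * var + sb2
    k_new = sw2 * cov + sb2
    rho_new = max(-1.0, min(1.0, k_new / q_new))
    return q_new, rho_new

def iterate_correlation(phi, rho0, depth, q0=1.0, sw2=1.0, sb2=0.0, grid=None):
    """Return [rho_0, rho_1, ..., rho_depth] under repeated layer maps."""
    grid = grid if grid is not None else normal_grid()
    q, rho = q0, rho0
    trajectory = [rho]
    for _ in range(depth):
        q, rho = layer_map(phi, q, rho, sw2, sb2, grid)
        trajectory.append(rho)
    return trajectory

if __name__ == "__main__":
    for name, phi in [("gelu", gelu), ("elu", elu)]:
        traj = iterate_correlation(phi, rho0=0.3, depth=5, sw2=2.0)
        print(name, [round(r, 4) for r in traj])
```

As a sanity check on the sketch, running one step with relu, sw2=2.0, sb2=0.0, q=1.0 and rho=0.0 should leave the variance close to 1 and give a correlation close to 1/pi, which agrees with the known arc-cosine kernel for ReLU networks. Whether the correlation then drifts towards a fixed point, and which one, depends on the activation and on the assumed variances.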

Cite

Tsuchida, R., Pearce, T., van der Heide, C., Roosta, F., & Gallagher, M. (2021). Avoiding Kernel Fixed Points: Computing with ELU and GELU Infinite Networks. In 35th AAAI Conference on Artificial Intelligence, AAAI 2021 (Vol. 11B, pp. 9967–9977). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/aaai.v35i11.17197
