Statistical mechanical analysis of learning dynamics of two-layer perceptron with multiple output units


This article is free to access.

Abstract

The plateau phenomenon, in which the loss stops decreasing for an extended period during learning, has long troubled neural-network training. Various studies suggest that plateaus are frequently caused by the network becoming trapped in a singular region of the loss surface, a region that stems from the symmetric structure of neural networks. However, these studies all deal with networks that have one-dimensional output; networks with multidimensional output have been overlooked. This paper uses a statistical mechanical formulation to analyze the learning dynamics of a two-layer perceptron with multidimensional output. We derive order parameters that capture the macroscopic characteristics of the connection weights, together with the differential equations they obey. In a simple setting, we show that singular-region-driven plateaus diminish or vanish when the output is multidimensional. We find that the less degenerate the model (i.e., the further it is from one-dimensional output), the more plateaus are alleviated. Furthermore, we show theoretically that singular-region-driven plateaus seldom occur during learning when the initialization is orthogonalized.
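The setting described in the abstract can be illustrated numerically in the standard teacher-student setup. The sketch below is an illustrative assumption, not the paper's exact formalism: a student two-layer perceptron with multiple output units is trained by online SGD to mimic a fixed random teacher, and its first-layer weight vectors are made mutually orthogonal at initialization via a QR decomposition. All dimensions, the learning rates, and the tanh activation are assumed for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, M = 100, 3, 3        # input / hidden / output dimensions (illustrative)

# Teacher: fixed random two-layer perceptron  y* = B g(A x)
A = rng.standard_normal((K, N)) / np.sqrt(N)
B = rng.standard_normal((M, K))
g = np.tanh
dg = lambda u: 1.0 - np.tanh(u) ** 2

# Student with orthogonalized first-layer initialization: QR makes the rows
# of J mutually orthogonal, so the hidden units start distinct and the
# dynamics begin away from the singular (permutation-symmetric) region.
J = np.linalg.qr(rng.standard_normal((N, K)))[0].T   # (K, N), orthonormal rows
W = rng.standard_normal((M, K))

eta = 0.05                 # illustrative learning rate
losses = []
for _ in range(5000):      # online learning: one fresh sample per step
    x = rng.standard_normal(N)
    y_star = B @ g(A @ x)                  # teacher output
    u = J @ x
    h = g(u)
    e = W @ h - y_star                     # student error
    losses.append(0.5 * float(e @ e))
    W -= eta * np.outer(e, h)                          # grad of 0.5*||e||^2 in W
    J -= (eta / N) * np.outer((W.T @ e) * dg(u), x)    # grad in J, 1/N scaling

print(losses[0], losses[-1])
```

Rerunning the same loop with an i.i.d. Gaussian initialization of `J` (so that hidden units can start nearly aligned) is the natural control for observing the singular-region-driven plateau that an orthogonal start is argued to shorten or remove.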

Citation (APA)

Yoshida, Y., Karakida, R., Okada, M., & Amari, S. I. (2019). Statistical mechanical analysis of learning dynamics of two-layer perceptron with multiple output units. Journal of Physics A: Mathematical and Theoretical, 52(18). https://doi.org/10.1088/1751-8121/ab0669
