Hierarchical models such as neural networks have complex singular structures. These singularities are known to affect both the estimation performance and the learning dynamics of the models. A number of recent studies have examined the properties of the estimators obtained for such models, but few have examined the dynamics of the learning process used to obtain them. Using two-layer neural networks, we investigate how singularities influence the dynamics of standard gradient learning and natural gradient learning under various learning conditions. In standard gradient learning, we find a quasi-plateau phenomenon, which in some cases is more severe than the well-known plateau. The slow convergence caused by the quasi-plateau and the plateau becomes extremely serious when the optimal point lies in the neighborhood of a singularity. In natural gradient learning, by contrast, neither the quasi-plateau nor the plateau is observed, and the convergence speed is hardly affected by the singularity. © Springer-Verlag Berlin Heidelberg 2004.
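The two update rules compared in the abstract can be illustrated with a minimal sketch. It is not the authors' experimental setup: the one-hidden-unit student `f(x) = v * tanh(w * x)`, the noiseless teacher with `w = v = 1`, the input grid, the learning rate, and the damping term are all assumptions chosen for illustration. In this toy model the Fisher matrix is exactly singular at `v = 0`, because the output no longer depends on `w` there.

```python
import math

# Hypothetical toy student (an assumption, not the paper's network):
#   f(x; w, v) = v * tanh(w * x)
# At v = 0 the output is independent of w, so w is unidentifiable there.

def model(x, w, v):
    return v * math.tanh(w * x)

def make_data():
    """Fixed input grid and noiseless teacher outputs (teacher: w = v = 1)."""
    xs = [-2.0 + 0.2 * i for i in range(21)]
    ys = [model(x, 1.0, 1.0) for x in xs]
    return xs, ys

def loss(xs, ys, w, v):
    """Mean of half squared errors over the data set."""
    return sum(0.5 * (model(x, w, v) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grad_and_fisher(xs, ys, w, v):
    """Return the loss gradient (gw, gv) and Fisher entries (g00, g01, g11).

    Assuming unit-variance Gaussian output noise, the Fisher matrix is the
    average outer product of the output's derivatives w.r.t. (w, v).
    """
    n = len(xs)
    gw = gv = g00 = g01 = g11 = 0.0
    for x, y in zip(xs, ys):
        t = math.tanh(w * x)
        dw = v * x * (1.0 - t * t)  # df/dw
        dv = t                      # df/dv
        err = v * t - y
        gw += err * dw / n
        gv += err * dv / n
        g00 += dw * dw / n
        g01 += dw * dv / n
        g11 += dv * dv / n
    return (gw, gv), (g00, g01, g11)

def standard_step(xs, ys, w, v, lr=0.1):
    """Standard gradient update: theta <- theta - lr * grad."""
    (gw, gv), _ = grad_and_fisher(xs, ys, w, v)
    return w - lr * gw, v - lr * gv

def natural_step(xs, ys, w, v, lr=0.1, damping=1e-4):
    """Natural gradient update: theta <- theta - lr * (G + damping*I)^-1 * grad.

    The damping term is an assumption added here so that the 2x2 inverse
    exists even at the singular point.
    """
    (gw, gv), (g00, g01, g11) = grad_and_fisher(xs, ys, w, v)
    a, b, d = g00 + damping, g01, g11 + damping
    det = a * d - b * b
    nw = (d * gw - b * gv) / det   # first row of the inverse times grad
    nv = (a * gv - b * gw) / det   # second row of the inverse times grad
    return w - lr * nw, v - lr * nv

def train(step, xs, ys, w, v, n_steps=200):
    """Apply one of the update rules repeatedly from the start point (w, v)."""
    for _ in range(n_steps):
        w, v = step(xs, ys, w, v)
    return w, v

if __name__ == "__main__":
    xs, ys = make_data()
    for name, step in (("standard", standard_step), ("natural", natural_step)):
        w, v = train(step, xs, ys, 0.5, 0.5)
        print(name, "final loss:", loss(xs, ys, w, v))
```

Starting points closer to `v = 0` can be passed to `train` to probe how each rule behaves near the singularity. The sketch makes no claim about reproducing the quasi-plateau reported in the paper.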
CITATION STYLE
Park, H., Inoue, M., & Okada, M. (2004). Learning dynamics of neural networks with singularity - Standard gradient vs. natural gradient. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 3157, pp. 282–291). Springer Verlag. https://doi.org/10.1007/978-3-540-28633-2_31