Effect of initial configuration of weights on training and function of artificial neural networks

Abstract

The function and performance of neural networks are largely determined by the evolution of their weights and biases during training, from the initial configuration of these parameters to one of the local minima of the loss function. We perform a quantitative statistical characterization of the deviation of the weights of two-hidden-layer feedforward ReLU networks of various sizes, trained via Stochastic Gradient Descent (SGD), from their initial random configuration. We compare the evolution of the distribution function of this deviation with the evolution of the loss during training. We observe that successful training via SGD leaves the network in the close neighborhood of the initial configuration of its weights. For each initial weight of a link, we measure the distribution function of the deviation from this value after training and find how the moments of this distribution and its peak depend on the initial weight. We explore the evolution of these deviations during training and observe an abrupt increase within the overfitting region. This jump occurs simultaneously with a similarly abrupt increase in the evolution of the loss function. Our results suggest that SGD’s ability to efficiently find local minima is restricted to the vicinity of the random initial configuration of weights.
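
The kind of measurement the abstract describes can be illustrated with a minimal NumPy sketch. It is not the authors' code: the synthetic regression task, the layer sizes, the He-style Gaussian initialization, the mean-squared-error loss, the learning rate, and all function names (`init_params`, `train`, `deviation_stats`) are assumptions made only for illustration. The sketch trains a two-hidden-layer ReLU network with minibatch SGD, then groups the links of one layer by their initial weight and reports the mean and standard deviation of the deviation from that initial weight in each group.

```python
import numpy as np

def init_params(sizes, rng):
    # He-style Gaussian initialization (an assumption; the abstract only says "random").
    return [(rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    # Activations of every layer: ReLU on hidden layers, linear output.
    acts = [x]
    for i, (w, b) in enumerate(params):
        z = acts[-1] @ w + b
        acts.append(z if i == len(params) - 1 else np.maximum(z, 0.0))
    return acts

def sgd_step(params, x, y, lr):
    # One minibatch step on the mean squared error, with manual backpropagation.
    acts = forward(params, x)
    grad = 2.0 * (acts[-1] - y) / len(x)
    new_params = []
    for i in reversed(range(len(params))):
        w, b = params[i]
        gw, gb = acts[i].T @ grad, grad.sum(axis=0)
        if i > 0:
            # Propagate through the ReLU that produced acts[i].
            grad = (grad @ w.T) * (acts[i] > 0)
        new_params.append((w - lr * gw, b - lr * gb))
    return new_params[::-1]

def train(params, x, y, lr=0.005, epochs=50, batch=32, seed=0):
    # Plain minibatch SGD; returns new parameters and leaves the input list untouched.
    rng = np.random.default_rng(seed)
    for _ in range(epochs):
        order = rng.permutation(len(x))
        for start in range(0, len(x), batch):
            idx = order[start:start + batch]
            params = sgd_step(params, x[idx], y[idx], lr)
    return params

def deviation_stats(w_init, w_final, bins=10):
    # Group links by initial weight; summarize the deviation within each group.
    w0, dev = w_init.ravel(), (w_final - w_init).ravel()
    edges = np.linspace(w0.min(), w0.max(), bins + 1)
    which = np.clip(np.digitize(w0, edges) - 1, 0, bins - 1)
    return [(edges[k], edges[k + 1], dev[which == k].mean(), dev[which == k].std())
            for k in range(bins) if np.any(which == k)]

def mse(params, x, y):
    return float(np.mean((forward(params, x)[-1] - y) ** 2))

rng = np.random.default_rng(0)
x = rng.normal(size=(512, 8))
y = np.sin(x.sum(axis=1, keepdims=True))   # synthetic target (assumption)
init = init_params([8, 64, 64, 1], rng)    # two hidden layers of 64 units
final = train(init, x, y)

w_init, w_final = init[1][0], final[1][0]  # hidden-to-hidden weight matrix
rel_dev = float(np.linalg.norm(w_final - w_init) / np.linalg.norm(w_init))
stats = deviation_stats(w_init, w_final)

print(f"loss before: {mse(init, x, y):.4f}, after: {mse(final, x, y):.4f}")
print(f"relative deviation of layer-2 weights: {rel_dev:.4f}")
for lo, hi, mean, std in stats:
    print(f"initial weight in [{lo:+.3f}, {hi:+.3f}]: mean dev {mean:+.5f}, std {std:.5f}")
```

Recording `rel_dev` and the loss after every epoch, rather than only at the end, would give the two curves whose evolution the abstract compares. Any numbers this toy setup prints say nothing about the paper's reported results.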

Cite

Jesus, R. J., Antunes, M. L., da Costa, R. A., Dorogovtsev, S. N., Mendes, J. F. F., & Aguiar, R. L. (2021). Effect of initial configuration of weights on training and function of artificial neural networks. Mathematics, 9(18). https://doi.org/10.3390/math9182246
