Residual Learning for FC Kernels of Convolutional Network

Abstract

One of the most important steps in training a neural network is choosing its depth. Theoretically, a complex decision function can be constructed by cascading a number of shallow networks, achieving similar accuracy at a significantly lower computational cost. In practice, beyond a certain point, simply increasing the depth of a network can actually decrease its accuracy; in the literature, this is known as the vanishing gradient problem. It manifests as a progressive decrease in the magnitudes of the weight gradients at each subsequent layer, effectively preventing the weights in the lower layers of a deep network from changing when the backpropagation algorithm is applied. The Residual Network (ResNet) approach addresses this problem for standard convolutional networks. However, ResNet solves the problem only partially: the resulting network is not truly sequential but rather an ensemble of shallow networks, with all the drawbacks typical of such ensembles. In this article, we investigate a convolutional network with fully connected layers (the so-called network-in-network architecture, NiN) and suggest another way to build an ensemble of shallow networks. In our case, we gradually reduce the number of parallel connections by using sequential network connections. This eliminates the influence of vanishing gradients and reduces the redundancy of the network, since all weight coefficients are used and no residual blocks, as in ResNet, are required. The method requires no change to the network architecture; only a proper initialization of its weights is needed.
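The abstract does not spell out the initialization scheme itself. As a rough illustration only, the Python sketch below shows one way the 1x1-convolution "FC kernels" of a NiN block could be initialized close to the identity mapping, so that a freshly stacked layer initially acts as a near-pass-through and gradients reach the lower layers without residual blocks. The class name IdentityInitNiNBlock, the noise_std parameter, and the identity-plus-noise scheme are assumptions made for illustration, not the authors' method.

    import torch
    import torch.nn as nn

    class IdentityInitNiNBlock(nn.Module):
        """1x1-convolution ("FC kernel") layer initialized near the identity.

        Hypothetical sketch: the identity-plus-noise scheme below is an
        assumption for illustration, not the initialization from the paper.
        """

        def __init__(self, channels: int, noise_std: float = 1e-2):
            super().__init__()
            # A 1x1 convolution acts as a fully connected layer applied
            # independently at every spatial position (the NiN "FC kernel").
            self.conv = nn.Conv2d(channels, channels, kernel_size=1, bias=True)
            self.act = nn.ReLU(inplace=True)
            with torch.no_grad():
                # Weight shape is (out_channels, in_channels, 1, 1); set it
                # to I + eps so the layer starts as a near-pass-through and
                # gradients initially flow as in a shallower network.
                eye = torch.eye(channels).view(channels, channels, 1, 1)
                self.conv.weight.copy_(
                    eye + noise_std * torch.randn_like(self.conv.weight)
                )
                self.conv.bias.zero_()

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Note: with ReLU the block is an exact pass-through only for
            # non-negative activations; this suffices for the illustration.
            return self.act(self.conv(x))

    # Usage: at initialization the block approximately forwards its input.
    x = torch.randn(8, 64, 32, 32)
    y = IdentityInitNiNBlock(64)(x)  # y is close to relu(x) before training

Under this scheme a deep stack of such blocks starts out behaving like a much shallower network, which is one plausible reading of the abstract's claim that only the weight initialization, not the architecture, needs to change.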

Citation (APA)

Alexeev, A., Matveev, Y., Matveev, A., & Pavlenko, D. (2019). Residual Learning for FC Kernels of Convolutional Network. In Lecture Notes in Computer Science (Vol. 11728, pp. 361–372). Springer. https://doi.org/10.1007/978-3-030-30484-3_30
