Residual Learning for FC Kernels of Convolutional Network

Abstract

One of the most important steps in training a neural network is choosing its depth. Theoretically, a complex decision function can be constructed by cascading a number of shallow networks, achieving similar accuracy at a significantly lower computational cost. In practice, beyond a certain point, simply increasing the depth of a network can actually decrease its accuracy; in the literature, this is known as the vanishing gradient problem. It manifests as a progressive decrease in the magnitudes of the weight gradients at each subsequent layer, effectively preventing the weights in the lower layers of a deep network from changing when the backpropagation algorithm is applied. The Residual Network (ResNet) approach addresses this problem for standard convolutional networks. However, ResNet solves the problem only partially: the resulting network is not truly sequential but rather an ensemble of shallow networks, with all the drawbacks typical of such ensembles. In this article, we investigate a convolutional network with fully connected layers (the so-called network-in-network architecture, NiN) and suggest another way to build an ensemble of shallow networks. In our case, we gradually reduce the number of parallel connections by using sequential network connections. This eliminates the influence of vanishing gradients and reduces the redundancy of the network, since all weight coefficients are used and no residual blocks, as in ResNet, are required. The method requires no change to the network architecture; only a proper initialization of its weights is needed.
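The abstract does not spell out the initialization scheme itself. As a rough illustration only, the Python sketch below shows one way the 1x1-convolution "FC kernels" of a NiN block could be initialized close to the identity mapping, so that a freshly stacked layer initially acts as a near-pass-through and gradients reach the lower layers without residual blocks. The class name IdentityInitNiNBlock, the noise_std parameter, and the identity-plus-noise scheme are assumptions made for illustration, not the authors' method.

    import torch
    import torch.nn as nn

    class IdentityInitNiNBlock(nn.Module):
        """1x1-convolution ("FC kernel") layer initialized near the identity.

        Hypothetical sketch: the identity-plus-noise scheme below is an
        assumption for illustration, not the initialization from the paper.
        """

        def __init__(self, channels: int, noise_std: float = 1e-2):
            super().__init__()
            # A 1x1 convolution acts as a fully connected layer applied
            # independently at every spatial position (the NiN "FC kernel").
            self.conv = nn.Conv2d(channels, channels, kernel_size=1, bias=True)
            self.act = nn.ReLU(inplace=True)
            with torch.no_grad():
                # Weight shape is (out_channels, in_channels, 1, 1); set it
                # to I + eps so the layer starts as a near-pass-through and
                # gradients initially flow as in a shallower network.
                eye = torch.eye(channels).view(channels, channels, 1, 1)
                self.conv.weight.copy_(
                    eye + noise_std * torch.randn_like(self.conv.weight)
                )
                self.conv.bias.zero_()

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Note: with ReLU the block is an exact pass-through only for
            # non-negative activations; this suffices for the illustration.
            return self.act(self.conv(x))

    # Usage: at initialization the block approximately forwards its input.
    x = torch.randn(8, 64, 32, 32)
    y = IdentityInitNiNBlock(64)(x)  # y is close to relu(x) before training

Under this scheme a deep stack of such blocks starts out behaving like a much shallower network, which is one plausible reading of the abstract's claim that only the weight initialization, not the architecture, needs to change.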

Citation (APA)

Alexeev, A., Matveev, Y., Matveev, A., & Pavlenko, D. (2019). Residual Learning for FC Kernels of Convolutional Network. In Lecture Notes in Computer Science (Vol. 11728, pp. 361–372). Springer. https://doi.org/10.1007/978-3-030-30484-3_30
