Homogeneous vector capsules enable adaptive gradient descent in convolutional neural networks

9 citations · 17 readers (Mendeley)

This article is free to access.

Abstract

Neural networks traditionally produce a scalar value for an activated neuron. Capsules, on the other hand, produce a vector of values, which has been shown to correspond to a single, composite feature in which the values of the vector's components indicate properties of the feature such as transformation or contrast. We present a new way of parameterizing and training capsules that we refer to as homogeneous vector capsules (HVCs). We demonstrate experimentally that altering a convolutional neural network (CNN) to use HVCs can achieve superior classification accuracy without increasing the number of parameters or operations in its architecture, as compared to a CNN using a single final fully connected layer. Additionally, the introduction of HVCs enables the use of adaptive gradient descent, reducing the dependence of a model's achievable accuracy on the finely tuned hyperparameters of a non-adaptive optimizer. We demonstrate our method and results using two neural network architectures. For the CNN architecture referred to as Inception v3, replacing the fully connected layers with HVCs increased the test accuracy by an average of 1.32% across all experiments conducted. For a simple monolithic CNN, we show HVCs improve test accuracy by an average of 19.16%.
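The scalar-versus-vector distinction the abstract draws can be sketched in a few lines. The following is an illustrative toy example, not the authors' implementation: the feature size, capsule dimension, and norm-based class scoring are assumptions chosen for clarity, loosely following common capsule-network conventions.

```python
import numpy as np

rng = np.random.default_rng(0)

num_features = 64   # size of the flattened CNN feature vector (assumed)
num_classes = 10
capsule_dim = 8     # components per class capsule (assumed)

features = rng.standard_normal(num_features)

# Conventional fully connected head: one weight vector per class,
# producing a single scalar activation (logit) per class.
W_fc = rng.standard_normal((num_classes, num_features))
scalar_logits = W_fc @ features            # shape: (num_classes,)

# Vector-capsule head: one weight matrix per class, producing a small
# vector of values per class instead of a single scalar.
W_caps = rng.standard_normal((num_classes, capsule_dim, num_features))
capsule_vectors = W_caps @ features        # shape: (num_classes, capsule_dim)

# One common convention scores each class by its capsule vector's length.
vector_scores = np.linalg.norm(capsule_vectors, axis=1)
prediction = int(np.argmax(vector_scores))
```

Here each class is represented by an 8-dimensional vector rather than one number, so the components are free to encode properties of the detected feature while the vector as a whole still yields a class score.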

Citation (APA)

Byerly, A., & Kalganova, T. (2021). Homogeneous vector capsules enable adaptive gradient descent in convolutional neural networks. IEEE Access, 9, 48519–48530. https://doi.org/10.1109/ACCESS.2021.3066842
