Substituting Convolutions for Neural Network Compression

Abstract

Many practitioners would like to deploy deep convolutional neural networks in memory-limited scenarios, e.g., on an embedded device. However, with an abundance of compression techniques available, it is not obvious how to proceed: many bring additional hyperparameter tuning with them, and many are specific to particular network types. In this paper, we propose a simple compression technique that is general, easy to apply, and requires minimal tuning. Given a large, trained network, we propose (i) substituting its expensive convolutions with cheap alternatives, leaving the overall architecture unchanged; and (ii) treating this new network as a student and training it through distillation with the original network as the teacher. We demonstrate this approach separately for (i) networks consisting predominantly of full 3 × 3 convolutions and (ii) networks built around 1 × 1 (pointwise) convolutions; together these make up the vast majority of contemporary networks. For pointwise substitution, we are able to leverage a number of methods originally developed as efficient alternatives to fully-connected layers, allowing us to provide Pareto-optimal trade-offs between efficiency and accuracy.
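To make the two-step recipe in the abstract concrete, below is a minimal sketch, assuming PyTorch, that substitutes full 3 × 3 convolutions with a cheap depthwise-separable stand-in and trains the result against the original network with standard logit distillation. The names (CheapConv, make_student, distillation_loss), the choice of substitute block, and the distillation objective are illustrative assumptions, not the paper's exact components.

```python
# Minimal sketch (assumed PyTorch, not the authors' code) of the two-step recipe:
# (i) swap expensive 3x3 convolutions for a cheap substitute, keeping the
#     architecture otherwise unchanged;
# (ii) train the substituted network as a student, with the original trained
#      network as the teacher, via distillation.
import copy

import torch.nn as nn
import torch.nn.functional as F


class CheapConv(nn.Module):
    """One possible cheap substitute: a depthwise-separable stand-in for a full 3x3 conv."""

    def __init__(self, conv: nn.Conv2d):
        super().__init__()
        # Depthwise 3x3 followed by a 1x1 pointwise conv, preserving the output shape.
        self.depthwise = nn.Conv2d(
            conv.in_channels, conv.in_channels, kernel_size=conv.kernel_size,
            stride=conv.stride, padding=conv.padding,
            groups=conv.in_channels, bias=False,
        )
        self.pointwise = nn.Conv2d(
            conv.in_channels, conv.out_channels, kernel_size=1,
            bias=conv.bias is not None,
        )

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


def _replace_convs(module: nn.Module) -> None:
    """Recursively replace every full 3x3 convolution with CheapConv, in place."""
    for name, child in module.named_children():
        if isinstance(child, nn.Conv2d) and child.kernel_size == (3, 3) and child.groups == 1:
            setattr(module, name, CheapConv(child))
        else:
            _replace_convs(child)


def make_student(teacher: nn.Module) -> nn.Module:
    """Copy the trained teacher and substitute its expensive convolutions."""
    student = copy.deepcopy(teacher)
    _replace_convs(student)
    return student


def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Standard logit distillation: soft teacher targets plus ordinary cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

In training, each batch would be passed through both networks (the teacher under no-grad) and distillation_loss minimised with respect to the student's parameters only; the paper's actual distillation objective and cheap substitutes may differ from this sketch.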

Cite

APA

Crowley, E. J., Gray, G., Turner, J., & Storkey, A. (2021). Substituting Convolutions for Neural Network Compression. IEEE Access, 9, 83199–83213. https://doi.org/10.1109/ACCESS.2021.3086321
