Very deep convolutional neural networks (CNNs) have been firmly established as the primary methods for many computer vision tasks. However, most state-of-the-art CNNs are large, which results in high inference latency. Depth-wise separable convolution has been proposed for image recognition tasks on platforms with limited computation power, such as robots and self-driving cars. Any regular deep CNN has a depth-wise separable counterpart that is faster but less accurate when trained in the same way. In this paper, we propose a novel decomposition approach based on SVD, namely depth-wise decomposition, for converting regular convolutions into depth-wise separable convolutions post-training while maintaining high accuracy. We show that our approach generalizes to the multi-channel and multi-layer cases by applying Generalized Singular Value Decomposition (GSVD). We conduct thorough experiments with the ShuffleNet V2 model on the large-scale image recognition dataset ImageNet. Our approach outperforms the channel decomposition baseline. Moreover, our approach improves the Top-1 accuracy of ShuffleNet V2 by ∼2%.
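To make the idea of converting a regular convolution into a depth-wise separable one concrete, the following is a minimal sketch and not the authors' implementation. It assumes the simplest weight-only reading of the abstract: for each input channel, the slice of the kernel is flattened into a matrix of shape (output channels, kernel height × kernel width) and replaced by its best rank-1 approximation from an SVD. The right singular vector becomes the depth-wise filter and the scaled left singular vector becomes a column of the point-wise (1×1) filter. The function names `depthwise_decompose` and `recompose` are hypothetical, and the data-dependent GSVD variant mentioned in the abstract is not covered.

```python
import numpy as np


def depthwise_decompose(weight):
    """Approximate a regular conv kernel by depth-wise and point-wise kernels.

    weight: array of shape (c_out, c_in, kh, kw).
    Returns (depthwise, pointwise) with shapes (c_in, kh, kw) and (c_out, c_in).
    """
    c_out, c_in, kh, kw = weight.shape
    depthwise = np.empty((c_in, kh, kw), dtype=weight.dtype)
    pointwise = np.empty((c_out, c_in), dtype=weight.dtype)
    for i in range(c_in):
        # Flatten the filters that read input channel i into a 2-D matrix.
        m = weight[:, i].reshape(c_out, kh * kw)
        u, s, vt = np.linalg.svd(m, full_matrices=False)
        # Keep only the leading singular triple (best rank-1 approximation).
        pointwise[:, i] = u[:, 0] * s[0]
        depthwise[i] = vt[0].reshape(kh, kw)
    return depthwise, pointwise


def recompose(depthwise, pointwise):
    """Rebuild the full (c_out, c_in, kh, kw) kernel implied by the two factors."""
    return pointwise[:, :, None, None] * depthwise[None, :, :, :]
```

Under this assumption, a kernel that is already exactly separable is recovered without error, while a general kernel loses the energy held in the discarded singular values of each per-channel matrix. That loss is one way to see why a post-training conversion needs care to maintain accuracy.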
He, Y., Qian, J., Le, C. X., Hetang, C., Lyu, Q., Wang, W., & Yue, T. (2023). Depth-wise Decomposition for Accelerating Separable Convolutions in Efficient Convolutional Neural Networks. Advances in Artificial Intelligence and Machine Learning, 3(4), 1699–1719. https://doi.org/10.54364/AAIML.2023.1197