Deep neural networks (DNNs) have achieved great success in many areas in recent years. However, as network architectures grow more sophisticated, their memory consumption and computational cost increase rapidly, greatly hindering deployment on mobile devices and other resource-constrained platforms. There is therefore a pressing need for model compression and acceleration techniques that do not degrade inference accuracy. In this paper, we review recent popular techniques for compressing and accelerating deep networks. These methods can be broadly divided into four categories: parameter pruning and sharing, low-rank approximation, sparse regularization constraints, and low-bit quantization of network weights. We describe the advantages and disadvantages of the different compression and acceleration methods in detail, introduce several other types of approach, and conclude with future prospects for the field.
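To make the first category concrete, the following is a minimal sketch of magnitude-based parameter pruning, one common instance of parameter pruning: weights with the smallest absolute values are zeroed out, on the assumption that they contribute least to the network's output. The function name and the NumPy-based setup are illustrative, not taken from the surveyed paper.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries of a weight matrix.

    A toy illustration of parameter pruning: only the (1 - sparsity)
    fraction of weights with the largest absolute value is kept.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to remove
    if k == 0:
        return weights.copy()
    # Magnitude threshold: the k-th smallest absolute value
    threshold = np.partition(flat, k - 1)[k - 1]
    # Keep only weights strictly above the threshold
    mask = np.abs(weights) > threshold
    return weights * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
pruned = magnitude_prune(w, 0.5)  # roughly half the entries become zero
```

In practice, pruning is usually followed by fine-tuning to recover accuracy, and the resulting sparse matrices are stored in compressed formats to realize the memory savings.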
Long, X., Ben, Z., & Liu, Y. (2019). A Survey of Related Research on Compression and Acceleration of Deep Neural Networks. In Journal of Physics: Conference Series (Vol. 1213). Institute of Physics Publishing. https://doi.org/10.1088/1742-6596/1213/5/052003