A Survey of Related Research on Compression and Acceleration of Deep Neural Networks

Abstract

Deep networks have achieved great success in many areas in recent years. However, as deep neural networks (DNNs) grow more sophisticated, their memory consumption and computational cost increase rapidly, greatly hindering their deployment on mobile devices and other resource-constrained platforms. There is therefore a pressing need for model compression and acceleration that does not degrade inference accuracy. In this paper, we review recent popular techniques for compressing and accelerating deep networks. These methods can be broadly divided into four categories: parameter pruning and sharing, low-rank approximation, sparse regularization constraints, and low-bit quantization of network weights. We describe the advantages and disadvantages of the different compression and acceleration methods in detail, introduce other types of approaches, and conclude with future prospects for the field.
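The abstract only names these technique families. As a concrete illustration (not taken from the paper), the following minimal NumPy sketch shows two of them applied to a single dense layer's weight matrix: magnitude-based parameter pruning and low-rank approximation via truncated SVD. The matrix size, sparsity level, and target rank are illustrative assumptions.

```python
# Minimal sketch (assumed, not the authors' code) of two surveyed ideas:
# magnitude-based pruning and low-rank approximation of one weight matrix.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512))  # example dense-layer weight matrix

# --- Parameter pruning: zero out weights with the smallest magnitudes ---
sparsity = 0.7                                # prune the smallest 70% (illustrative)
threshold = np.quantile(np.abs(W), sparsity)
W_pruned = np.where(np.abs(W) < threshold, 0.0, W)
print("nonzero fraction after pruning:",
      np.count_nonzero(W_pruned) / W_pruned.size)

# --- Low-rank approximation: keep the top-r singular components of W ---
r = 32                                        # target rank (illustrative)
U, S, Vt = np.linalg.svd(W, full_matrices=False)
W_lowrank = (U[:, :r] * S[:r]) @ Vt[:r, :]    # rank-r reconstruction
# Storage drops from 256*512 parameters to 256*r + r*512.
print("relative Frobenius error of rank-r approximation:",
      np.linalg.norm(W - W_lowrank) / np.linalg.norm(W))
```

In practice, both transformations are typically followed by fine-tuning to recover any accuracy lost to the compression, a point the survey's categories (pruning, low-rank, sparsity, quantization) all share.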

Citation (APA)
Long, X., Ben, Z., & Liu, Y. (2019). A Survey of Related Research on Compression and Acceleration of Deep Neural Networks. In Journal of Physics: Conference Series (Vol. 1213). Institute of Physics Publishing. https://doi.org/10.1088/1742-6596/1213/5/052003
