In recent years, deep learning, represented by the Convolutional Neural Network (CNN), has been one of the hottest research topics. Models based on the CNN inference process have been widely used in a growing number of computer vision applications. The execution speed of the inference process is critical for these applications, so hardware acceleration is commonly employed. To relieve memory pressure, data quantization strategies are often used in hardware implementations. In this paper, a dynamic multi-precision fixed-point data quantization strategy for CNNs is proposed and used to quantize the floating-point data in the trained CNN inference process. Results show that, for the LeNet model with 8/4-bit quantization, our strategy reduces the accuracy loss from 22.2% to at most 5.9% compared with the previous static quantization strategy. With 16-bit quantization, our strategy introduces only 0.03% accuracy loss while halving the memory footprint and bandwidth requirement compared with a 32-bit floating-point implementation.
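The core idea of dynamic fixed-point quantization, as described above, is to choose the fractional bit-width per data group according to its value range rather than fixing it globally. The sketch below illustrates this under stated assumptions (the exact range-analysis and rounding rules of the paper's strategy may differ; the function name and bit-allocation heuristic are illustrative, not the authors' implementation):

```python
import numpy as np

def dynamic_fixed_point_quantize(x, total_bits=8):
    """Quantize a float array to signed fixed-point, choosing the
    fractional length dynamically from the data range.

    Illustrative sketch: integer bits are allocated to cover the
    largest magnitude; the remaining bits (after the sign bit)
    become fractional bits. Returns the dequantized values and the
    chosen fractional length.
    """
    max_abs = float(np.max(np.abs(x)))
    if max_abs == 0.0:
        return np.zeros_like(x), total_bits - 1
    # Integer bits needed so that max_abs fits (sign bit excluded).
    int_bits = max(0, int(np.ceil(np.log2(max_abs))))
    frac_bits = total_bits - 1 - int_bits
    scale = 2.0 ** frac_bits
    qmin = -(2 ** (total_bits - 1))
    qmax = 2 ** (total_bits - 1) - 1
    # Round to the nearest representable fixed-point code, then clip.
    q = np.clip(np.round(x * scale), qmin, qmax)
    return q / scale, frac_bits

# Example: small-magnitude weights get a long fractional part.
w = np.array([0.5, -0.25, 0.1])
wq, fl = dynamic_fixed_point_quantize(w, total_bits=8)
```

Because the fractional length adapts to each group of values (e.g. per layer, or separately for weights and activations), small-magnitude data keeps more fractional precision than a single static format would allow, which is the source of the accuracy improvement reported in the abstract.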
CITATION STYLE
Shan, L., Zhang, M., Deng, L., & Gong, G. (2016). A dynamic multi-precision fixed-point data quantization strategy for convolutional neural network. In Communications in Computer and Information Science (Vol. 666 CCIS, pp. 102–111). Springer Verlag. https://doi.org/10.1007/978-981-10-3159-5_10