A dynamic multi-precision fixed-point data quantization strategy for convolutional neural network

Abstract

In recent years, deep learning, represented by the Convolutional Neural Network (CNN), has been one of the hottest research topics. Models based on the CNN inference process are widely used in a growing number of computer vision applications. The execution speed of the inference process is critical for these applications, and hardware acceleration is the method most often considered. To relieve memory pressure, data quantization strategies are often used in hardware implementations. In this paper, a dynamic multi-precision fixed-point data quantization strategy for CNNs is proposed and used to quantize the floating-point data in the inference process of a trained CNN. Results show that, for the LeNet model with 8/4-bit quantization, our strategy reduces the accuracy loss from 22.2% to at most 5.9% compared with a previous static quantization strategy. With 16-bit quantization, our strategy introduces only 0.03% accuracy loss while halving the memory footprint and bandwidth requirement compared with a 32-bit floating-point implementation.
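
The key idea in the abstract is that the fixed-point format (the split between integer and fractional bits) is chosen dynamically per tensor, rather than fixed statically for the whole network. As a rough illustration only, the sketch below shows the common per-tensor formulation of dynamic fixed-point quantization in Python; the function name and details are illustrative assumptions, not the authors' exact algorithm.

    import numpy as np

    def dynamic_fixed_point_quantize(x, total_bits=8):
        # Hypothetical sketch: pick the fractional length per tensor so the
        # largest magnitude in x fits, then round to the nearest fixed-point
        # value. This mirrors the general idea of dynamic fixed point, not
        # necessarily the paper's exact procedure.
        max_abs = float(np.max(np.abs(x)))
        # Bits needed for the integer part (one bit is reserved for the sign).
        int_bits = max(0, int(np.ceil(np.log2(max_abs)))) if max_abs > 0 else 0
        frac_bits = total_bits - 1 - int_bits
        scale = 2.0 ** frac_bits
        q_min, q_max = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
        q = np.clip(np.round(x * scale), q_min, q_max)
        return q / scale, frac_bits

    # Example: quantize one layer's weights to 8 bits. Another layer may
    # receive a different fractional length, which is what makes the scheme
    # "multi-precision" across the network.
    weights = np.random.randn(64).astype(np.float32) * 0.5
    q_weights, frac_len = dynamic_fixed_point_quantize(weights, total_bits=8)

Because each tensor gets its own fractional length, layers with small-magnitude values keep more fractional bits and lose less precision than under a single static format, which is consistent with the accuracy-loss reduction the abstract reports.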

Citation (APA)

Shan, L., Zhang, M., Deng, L., & Gong, G. (2016). A dynamic multi-precision fixed-point data quantization strategy for convolutional neural network. In Communications in Computer and Information Science (Vol. 666, pp. 102–111). Springer. https://doi.org/10.1007/978-981-10-3159-5_10
