In recent years, deep learning, represented by the Convolutional Neural Network (CNN), has been one of the hottest research topics. Models based on the CNN inference process have been widely used in a growing number of computer vision applications. The execution speed of the inference process is critical for these applications, so hardware acceleration is commonly employed. To relieve memory pressure, data quantization strategies are often used in hardware implementations. In this paper, a dynamic multi-precision fixed-point data quantization strategy for CNNs is proposed and used to quantize the floating-point data in the trained CNN inference process. Results show that, for the LeNet model with 8/4-bit quantization, our strategy reduces the accuracy loss from 22.2% to at most 5.9% compared with the previous static quantization strategy. With 16-bit quantization, our strategy introduces only 0.03% accuracy loss while halving the memory footprint and bandwidth requirement compared with a 32-bit floating-point implementation.
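The core idea of dynamic fixed-point quantization, as described above, is to choose the fractional bit-width per data group according to its value range rather than fixing it globally. The sketch below illustrates this under stated assumptions (the exact range-analysis and rounding rules of the paper's strategy may differ; the function name and bit-allocation heuristic are illustrative, not the authors' implementation):

```python
import numpy as np

def dynamic_fixed_point_quantize(x, total_bits=8):
    """Quantize a float array to signed fixed-point, choosing the
    fractional length dynamically from the data range.

    Illustrative sketch: integer bits are allocated to cover the
    largest magnitude; the remaining bits (after the sign bit)
    become fractional bits. Returns the dequantized values and the
    chosen fractional length.
    """
    max_abs = float(np.max(np.abs(x)))
    if max_abs == 0.0:
        return np.zeros_like(x), total_bits - 1
    # Integer bits needed so that max_abs fits (sign bit excluded).
    int_bits = max(0, int(np.ceil(np.log2(max_abs))))
    frac_bits = total_bits - 1 - int_bits
    scale = 2.0 ** frac_bits
    qmin = -(2 ** (total_bits - 1))
    qmax = 2 ** (total_bits - 1) - 1
    # Round to the nearest representable fixed-point code, then clip.
    q = np.clip(np.round(x * scale), qmin, qmax)
    return q / scale, frac_bits

# Example: small-magnitude weights get a long fractional part.
w = np.array([0.5, -0.25, 0.1])
wq, fl = dynamic_fixed_point_quantize(w, total_bits=8)
```

Because the fractional length adapts to each group of values (e.g. per layer, or separately for weights and activations), small-magnitude data keeps more fractional precision than a single static format would allow, which is the source of the accuracy improvement reported in the abstract.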
CITATION STYLE
Shan, L., Zhang, M., Deng, L., & Gong, G. (2016). A dynamic multi-precision fixed-point data quantization strategy for convolutional neural network. In Communications in Computer and Information Science (Vol. 666 CCIS, pp. 102–111). Springer Verlag. https://doi.org/10.1007/978-981-10-3159-5_10