Post-training Piecewise Linear Quantization for Deep Neural Networks

Abstract

Quantization plays an important role in the energy-efficient deployment of deep neural networks on resource-limited devices. Post-training quantization is highly desirable since it does not require retraining or access to the full training dataset. The well-established uniform scheme for post-training quantization achieves satisfactory results by converting neural networks from full-precision to 8-bit fixed-point integers. However, it suffers from significant performance degradation when quantizing to lower bit-widths. In this paper, we propose a piecewise linear quantization (PWLQ) scheme (code will be made available at https://github.com/jun-fang/PWLQ) to enable accurate approximation of tensor values that have bell-shaped distributions with long tails. Our approach breaks the entire quantization range of each tensor into non-overlapping regions, with each region assigned an equal number of quantization levels. Optimal breakpoints that divide the entire range are found by minimizing the quantization error. Experimental results show that, compared to state-of-the-art post-training quantization methods, our proposed method achieves superior performance on image classification, semantic segmentation, and object detection, with minor overhead.
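
To make the idea concrete, below is a minimal NumPy sketch of the scheme as the abstract describes it: a single symmetric breakpoint splits the range into a center region and tail regions, each region receives an equal number of quantization levels, and the breakpoint is chosen to minimize the quantization error. The function names, the one-breakpoint layout, and the grid search are illustrative assumptions on our part; the paper itself selects optimal breakpoints by minimizing the quantization error (not necessarily via grid search) and may cover more general region layouts.

    import numpy as np

    def uniform_quantize(x, lo, hi, num_levels):
        # Uniform quantization of x onto num_levels grid points in [lo, hi].
        scale = (hi - lo) / (num_levels - 1)
        q = np.round((np.clip(x, lo, hi) - lo) / scale)
        return q * scale + lo

    def pwlq_quantize(w, p, bits=4):
        # One symmetric breakpoint p splits [-m, m] into a center region
        # [-p, p] and two tail regions. Center and tails each get 2^(bits-1)
        # levels, so the dense center of a bell-shaped tensor is quantized
        # with finer resolution than the long tails.
        m = np.abs(w).max()
        levels = 2 ** (bits - 1)
        center = np.abs(w) <= p
        out = np.empty_like(w)
        out[center] = uniform_quantize(w[center], -p, p, levels)
        tails = ~center
        out[tails] = np.sign(w[tails]) * uniform_quantize(
            np.abs(w[tails]), p, m, levels)
        return out

    def breakpoint_by_grid_search(w, bits=4, num_candidates=64):
        # Simple stand-in for the paper's error-minimizing breakpoint
        # selection: pick the candidate with the smallest L2 error.
        m = np.abs(w).max()
        candidates = np.linspace(0.1 * m, 0.9 * m, num_candidates)
        errors = [np.sum((w - pwlq_quantize(w, p, bits)) ** 2)
                  for p in candidates]
        return candidates[int(np.argmin(errors))]

    # Example on synthetic bell-shaped weights:
    w = np.random.randn(16384).astype(np.float32)
    p = breakpoint_by_grid_search(w, bits=4)
    w_q = pwlq_quantize(w, p, bits=4)

In this sketch the only extra bookkeeping beyond uniform quantization is the breakpoint and the per-region scales, which hints at why the abstract can claim only minor overhead.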

Cite (APA)

Fang, J., Shafiee, A., Abdel-Aziz, H., Thorsley, D., Georgiadis, G., & Hassoun, J. H. (2020). Post-training Piecewise Linear Quantization for Deep Neural Networks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12347 LNCS, pp. 69–86). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-58536-5_5
