Towards accurate low bit-width quantization with multiple phase adaptations

Zhaoyi Yan; Yemin Shi; Yaowei Wang; Mingkui Tan; Zheyang Li; Wenming Tan; Yonghong Tian

Conference Proceedings, Open Access

AAAI 2020 - 34th AAAI Conference on Artificial Intelligence (2020) 6591-6598

DOI: 10.1609/aaai.v34i04.6134

Abstract

Low bit-width model quantization is highly desirable when deploying a deep neural network on mobile and edge devices. Quantization is an effective way to reduce model size through low bit-width weight representation. However, an unacceptable accuracy drop hinders the development of this approach. One possible reason is that the weights in each quantization interval are directly assigned to the interval's center. At the same time, some quantization applications are limited by the variety of network models. Accordingly, in this paper, we propose Multiple Phase Adaptations (MPA), a framework designed to address these two problems. First, weights in the target interval are assigned to the center by gradually spreading the quantization range. During the MPA process, the accuracy drop can be compensated for by the unquantized parts. Moreover, as MPA does not introduce hyperparameters that depend on the model or bit-width, the framework can be conveniently applied to various models. Extensive experiments demonstrate that MPA achieves higher accuracy than most existing methods on classification tasks for AlexNet, VGG-16 and ResNet.
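The idea of assigning weights to interval centers by gradually spreading the quantization range can be illustrated with a small sketch. It is not the paper's algorithm, whose details the abstract does not give. It assumes a uniform grid of centers and a fixed number of phases, and every name in it (`make_centers`, `snap_phase`, `progressive_quantize`, the optional `retrain` callback) is hypothetical. In each phase, only weights within a growing radius of their nearest center are snapped and frozen; the remaining full-precision weights are left free so that a fine-tuning step could adjust them to compensate.

```python
def make_centers(w_min, w_max, bits):
    """Evenly spaced quantization centers over [w_min, w_max] (assumed uniform grid)."""
    levels = 2 ** bits
    step = (w_max - w_min) / (levels - 1)
    return [w_min + i * step for i in range(levels)]


def snap_phase(weights, frozen, centers, fraction):
    """Snap unfrozen weights lying within `fraction` of the half-interval
    around their nearest center, and mark them as frozen (in place)."""
    half = (centers[1] - centers[0]) / 2
    radius = fraction * half
    for i, w in enumerate(weights):
        if frozen[i]:
            continue
        c = min(centers, key=lambda center: abs(w - center))
        # In the last phase (fraction >= 1) every remaining weight is snapped.
        if fraction >= 1.0 or abs(w - c) <= radius:
            weights[i] = c
            frozen[i] = True
    return weights, frozen


def progressive_quantize(weights, bits=2, phases=4, retrain=None):
    """Quantize in several phases, widening the snapped range each time.

    `retrain` is a hypothetical callback (weights, frozen) -> weights that
    would fine-tune only the unfrozen weights between phases.
    """
    weights = list(weights)
    frozen = [False] * len(weights)
    centers = make_centers(min(weights), max(weights), bits)
    for p in range(1, phases + 1):
        snap_phase(weights, frozen, centers, p / phases)
        if retrain is not None and p < phases:
            weights = retrain(weights, frozen)
    return weights
```

Calling `progressive_quantize([-1.0, -0.9, -0.2, 0.1, 0.4, 1.0], bits=2)` returns a list in which every value is one of the four centers. In a real training loop the `retrain` step would run gradient updates on the weights that are still unquantized.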

Cite

Yan, Z., Shi, Y., Wang, Y., Tan, M., Li, Z., Tan, W., & Tian, Y. (2020). Towards accurate low bit-width quantization with multiple phase adaptations. In AAAI 2020 - 34th AAAI Conference on Artificial Intelligence (pp. 6591–6598). AAAI Press. https://doi.org/10.1609/aaai.v34i04.6134
