Learning low-precision neural networks without Straight-Through Estimator (STE)

Abstract

The Straight-Through Estimator (STE) [Hinton, 2012; Bengio et al., 2013] is widely used for back-propagating gradients through the quantization function, but the STE technique lacks a complete theoretical understanding. We propose an alternative methodology called alpha-blending (AB), which quantizes neural networks to low precision using stochastic gradient descent (SGD). Our AB method avoids the STE approximation by replacing the quantized weight in the loss function with an affine combination of the quantized weight w_q and the corresponding full-precision weight w, using non-trainable scalar coefficients α and (1−α). During training, α is gradually increased from 0 to 1; gradient updates to the weights flow through the full-precision term (1−α)w of the affine combination, and the model is converted progressively from full precision to low precision. To evaluate the AB method, a 1-bit BinaryNet [Hubara et al., 2016a] on the CIFAR-10 dataset and 8-bit and 4-bit MobileNet v1 and ResNet-50 v1/2 on ImageNet are trained using the alpha-blending approach; the evaluation indicates that AB improves top-1 accuracy by 0.9%, 0.82%, and 2.93%, respectively, compared with the results of STE-based quantization [Hubara et al., 2016a; Krishnamoorthi, 2018].
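
The snippet below is a minimal PyTorch sketch of the alpha-blending idea described in the abstract. The symmetric uniform quantizer, the linear alpha schedule, and the toy regression loss are illustrative assumptions, not the paper's exact setup; only the affine combination α·w_q + (1−α)·w and the gradient path through the full-precision term come from the abstract.

```python
import torch

def quantize(w, num_bits):
    # Symmetric uniform quantizer (illustrative; the paper's exact quantizer may differ).
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.detach().abs().max() / qmax
    return torch.round(w / scale).clamp(-qmax, qmax) * scale

def alpha_blend(w, alpha, num_bits):
    # Return alpha * w_q + (1 - alpha) * w.
    # w_q is detached, so gradients reach w only through the (1 - alpha) * w
    # term of the affine combination -- no straight-through estimator needed.
    w_q = quantize(w, num_bits).detach()
    return alpha * w_q + (1 - alpha) * w

# Toy training loop: alpha is ramped linearly from 0 to 1 (this schedule is an
# assumption; the abstract only states that alpha increases gradually).
w = torch.randn(64, 32, requires_grad=True)
x = torch.randn(16, 32)
target = torch.randn(16, 64)
optimizer = torch.optim.SGD([w], lr=0.01)

total_steps = 1000
for step in range(total_steps):
    alpha = min(1.0, step / total_steps)
    w_blend = alpha_blend(w, alpha, num_bits=4)
    loss = torch.nn.functional.mse_loss(x @ w_blend.t(), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# At alpha = 1 the forward pass uses only the quantized weight, so w can then
# be replaced by quantize(w, 4) for low-precision inference.
```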

Cite

APA

Liu, Z. G., & Mattina, M. (2019). Learning low-precision neural networks without Straight-Through Estimator (STE). In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2019-August, pp. 3066–3072). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2019/425
