Abstract
There has been a recent surge of interest in dynamic inference techniques that reduce the cost of inference without sacrificing model accuracy. These techniques rest on the assumption that not all parts of the output feature map (OFM) are equally important for every input. The parts of the OFM deemed unimportant for a given input can be skipped entirely or computed at lower precision, which reduces the number of computations. In this paper, we focus on one such technique, Precision Gating (PG), which targets unimportant features in the spatial domain of the OFM. PG computes most features in low precision to identify the regions of the OFM where an object of interest is present, and computes the high-precision OFM for those regions only. We show that PG loses accuracy as its multiply-accumulate (MAC) reduction is pushed further. We identify orthogonal dynamic optimization opportunities that PG does not exploit and show that the combined techniques achieve far better results than either individual baseline. This Hybrid Model achieves 1.92x computation savings on a CIFAR-10 model at an accuracy of 91.35%. At similar computation savings, the PG model achieves an accuracy of 89.9%. Additionally, we show that PG leads to GEMM computations that are not hardware-aware, and we propose a fix that makes the PG technique CPU-friendly without losing accuracy.
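The two-pass idea behind PG can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `precision_gated_matmul`, the split of activations into most- and least-significant bits, and the fixed `threshold` are assumptions made for illustration, since the abstract only states that a low-precision pass selects the regions that then receive high-precision computation.

```python
import numpy as np

def precision_gated_matmul(x_q, w, bits=8, msb_bits=4, threshold=0.0):
    """Hypothetical sketch of a precision-gated matrix product.

    x_q: unsigned integer activations in [0, 2**bits), shape (n, k)
    w:   weights, shape (k, m)
    Returns (out, mask), where mask marks outputs refined at full precision.
    """
    lsb_bits = bits - msb_bits
    # Assumption: "low precision" means keeping only the high-order bits.
    x_msb = (x_q >> lsb_bits) << lsb_bits
    x_lsb = x_q - x_msb

    # Pass 1: low-precision product used to decide which outputs matter.
    out_low = x_msb @ w
    mask = out_low > threshold

    # Pass 2: add the low-order-bit contribution for selected outputs only.
    # Computed densely here for clarity; a real kernel would skip the
    # unselected positions to save MACs.
    out_full = out_low + x_lsb @ w
    out = np.where(mask, out_full, out_low)
    return out, mask

x_q = np.array([[255, 16], [3, 5]], dtype=np.int64)
w = np.array([[1.0], [1.0]])
out, mask = precision_gated_matmul(x_q, w, threshold=10.0)
# Fraction of outputs that needed the second, high-precision pass.
refined_fraction = mask.mean()
```

In this sketch, the fraction of `True` entries in `mask` determines how much second-pass work remains, which is why pushing the threshold higher saves more computation but leaves more outputs at reduced precision.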
Citation
Huang, X., Thakker, U., Gope, D., & Beu, J. (2020). Pushing the Envelope of Dynamic Spatial Gating technologies. In AIChallengeIoT 2020 - Proceedings of the 2020 2nd International Workshop on Challenges in Artificial Intelligence and Machine Learning for Internet of Things (pp. 21–26). Association for Computing Machinery, Inc. https://doi.org/10.1145/3417313.3429380