Queen Jane Approximately: Enabling Efficient Neural Network Inference with Context-Adaptivity

Abstract

Recent advances in deep learning allow on-demand reduction of model complexity without the need for re-training, enabling a dynamic trade-off between inference accuracy and energy savings. Approximate mobile computing, on the other hand, adapts the computation approximation level as the context of usage, and consequently the computation or result-accuracy needs, vary. In this work, we propose a synergy between the two directions and develop a context-aware method for dynamically adjusting the width of an on-device neural network based on the input and on context-dependent classification confidence. We implement our method on a human activity recognition neural network and, through measurements on a real-world embedded device, demonstrate that such a network would save up to 37.8% energy while inducing only a 1% loss of accuracy, if used for continuous activity monitoring in the field of elderly care.
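The core idea of the abstract, running a slimmed-down network first and widening it only when the classification confidence is too low, can be sketched as follows. This is a minimal illustration, not the authors' implementation; the `models` list, the confidence threshold, and the escalation loop are all assumptions introduced here, standing in for width-configurable sub-networks of a slimmable model.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(logits - logits.max())
    return e / e.sum()

def adaptive_width_inference(x, models, threshold=0.9):
    """Run progressively wider sub-networks until the prediction is
    confident enough, then stop (saving the cost of the wider ones).

    `models` is a hypothetical list of callables ordered from slimmest
    (cheapest) to widest, each mapping an input to class logits.
    Returns (predicted class, confidence, index of the width used).
    """
    for i, model in enumerate(models):
        probs = softmax(model(x))
        confidence = probs.max()
        # Accept the cheap prediction if it is confident, or if no
        # wider sub-network remains to escalate to.
        if confidence >= threshold or i == len(models) - 1:
            return int(np.argmax(probs)), float(confidence), i

# Toy stand-ins: the slim network is unsure, the wide one is confident,
# so inference escalates from width 0 to width 1.
slim = lambda x: np.array([0.10, 0.20, 0.15])  # max softmax prob ~0.35
wide = lambda x: np.array([5.0, 0.0, 0.0])     # max softmax prob ~0.99
pred, conf, width_used = adaptive_width_inference(None, [slim, wide])
```

In the paper's setting, the threshold itself would additionally depend on context (e.g. the user's current activity), so that easy, common inputs are handled by the slim configuration and energy is spent on wider widths only when needed.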

Citation (APA)

Machidon, O., Sluga, D., & Pejović, V. (2021). Queen Jane Approximately: Enabling Efficient Neural Network Inference with Context-Adaptivity. In Proceedings of the 1st Workshop on Machine Learning and Systems, EuroMLSys 2021 (pp. 48–54). Association for Computing Machinery. https://doi.org/10.1145/3437984.3458833
