Queen Jane Approximately: Enabling Efficient Neural Network Inference with Context-Adaptivity

Abstract

Recent advances in deep learning allow on-demand reduction of model complexity without the need for re-training, enabling a dynamic trade-off between inference accuracy and energy savings. Approximate mobile computing, on the other hand, adapts the computation approximation level as the context of usage, and consequently the computation or result-accuracy needs, vary. In this work, we propose a synergy between the two directions and develop a context-aware method for dynamically adjusting the width of an on-device neural network based on the input and on context-dependent classification confidence. We implement our method on a human activity recognition neural network and, through measurements on a real-world embedded device, demonstrate that such a network would save up to 37.8% energy while inducing only a 1% loss of accuracy, if used for continuous activity monitoring in the field of elderly care.
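The core idea of the abstract, running a slimmed-down network first and widening it only when the classification confidence is too low, can be sketched as follows. This is a minimal illustration, not the authors' implementation; the `models` list, the confidence threshold, and the escalation loop are all assumptions introduced here, standing in for width-configurable sub-networks of a slimmable model.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(logits - logits.max())
    return e / e.sum()

def adaptive_width_inference(x, models, threshold=0.9):
    """Run progressively wider sub-networks until the prediction is
    confident enough, then stop (saving the cost of the wider ones).

    `models` is a hypothetical list of callables ordered from slimmest
    (cheapest) to widest, each mapping an input to class logits.
    Returns (predicted class, confidence, index of the width used).
    """
    for i, model in enumerate(models):
        probs = softmax(model(x))
        confidence = probs.max()
        # Accept the cheap prediction if it is confident, or if no
        # wider sub-network remains to escalate to.
        if confidence >= threshold or i == len(models) - 1:
            return int(np.argmax(probs)), float(confidence), i

# Toy stand-ins: the slim network is unsure, the wide one is confident,
# so inference escalates from width 0 to width 1.
slim = lambda x: np.array([0.10, 0.20, 0.15])  # max softmax prob ~0.35
wide = lambda x: np.array([5.0, 0.0, 0.0])     # max softmax prob ~0.99
pred, conf, width_used = adaptive_width_inference(None, [slim, wide])
```

In the paper's setting, the threshold itself would additionally depend on context (e.g. the user's current activity), so that easy, common inputs are handled by the slim configuration and energy is spent on wider widths only when needed.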

Citation (APA)

Machidon, O., Sluga, D., & Pejović, V. (2021). Queen Jane Approximately: Enabling Efficient Neural Network Inference with Context-Adaptivity. In Proceedings of the 1st Workshop on Machine Learning and Systems, EuroMLSys 2021 (pp. 48–54). Association for Computing Machinery. https://doi.org/10.1145/3437984.3458833
