Deep Neural Networks (DNNs) have emerged in recent years as the most promising approach to smart data processing. However, deploying them effectively on resource-constrained architectures, such as edge devices that must run at least the inference phase, remains a challenge. This work investigates the impact of two weight compression techniques, originally designed and tested for DNN hardware accelerators, in a scenario involving general-purpose low-end hardware. After applying several levels of weight compression to the MobileNet DNN model, we show how accelerator-oriented weight compression techniques can reduce both memory traffic pressure and inference latency, in some cases with a good trade-off in terms of accuracy loss.
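The abstract does not detail the two compression techniques themselves. As a purely illustrative sketch of weight compression (not necessarily the paper's method), the following shows magnitude-based pruning, a common approach in which the smallest-magnitude weights of a layer are zeroed so the model can be stored and transferred more compactly:

```python
def prune_weights(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights (magnitude pruning).

    weights: flat list of floats (e.g. one layer's weights);
    sparsity: fraction in [0, 1] of weights to set to zero.
    Illustrative only; the paper's actual compression schemes may differ.
    """
    n_prune = int(len(weights) * sparsity)
    if n_prune == 0:
        return list(weights)
    # Threshold = magnitude of the n_prune-th smallest |w|.
    # Ties at the threshold may prune slightly more than n_prune weights.
    threshold = sorted(abs(w) for w in weights)[n_prune - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

w = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
print(prune_weights(w, 0.5))  # -> [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
```

Higher sparsity levels shrink memory traffic further but typically cost more accuracy, which is the trade-off the study evaluates on low-end boards.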
Citation:
Canzonieri, G., Monteleone, S., Palesi, M., Russo, E., & Patti, D. (2022). Analyzing the Impact of DNN Hardware Accelerators-Oriented Compression Techniques on General-Purpose Low-End Boards. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13475 LNCS, pp. 143–155). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-14391-5_11