Memory Bandwidth and Energy Efficiency Optimization of Deep Convolutional Neural Network Accelerators


Abstract

Deep convolutional neural networks (DNNs) achieve state-of-the-art accuracy, but at the cost of massive computation and memory operations. Although highly parallel devices effectively meet the computational requirements, energy efficiency remains a challenge. In this paper, we present two novel computation sequences, NHWCfine and NHWCcoarse, for DNN accelerators, and combine each sequence with an appropriate data layout. The proposed modes enable contiguous memory access patterns and reduce the number of memory accesses by exploiting and transforming the local data reuse of weights and feature maps in high-dimensional convolutions. Experiments on various convolutional layers show that the proposed modes, each pairing a computation sequence with a data layout, are more energy efficient than the baseline mode across several networks, reducing total energy consumption by up to 4.10× and off-chip memory access latency by up to 5.11×.
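The key layout idea behind the abstract's claim of contiguous access can be illustrated with plain offset arithmetic. The sketch below is ours, not the paper's: the specific NHWCfine and NHWCcoarse sequences are defined in the paper itself, and the function names here are hypothetical. It only shows the generic difference between NCHW and NHWC flat-offset computation, namely that NHWC places channels innermost, so iterating over channels at a fixed spatial position is unit-stride.

#include <stdio.h>

/* Illustrative sketch only: shows why an NHWC layout yields contiguous
 * channel-wise access while NCHW does not. All names are hypothetical. */

/* Flat offset of element (n, c, h, w) in an NCHW tensor. */
static size_t offset_nchw(size_t n, size_t c, size_t h, size_t w,
                          size_t C, size_t H, size_t W) {
    return ((n * C + c) * H + h) * W + w;
}

/* Flat offset of the same element in an NHWC tensor: channels are the
 * innermost dimension, so stepping through c is unit-stride. */
static size_t offset_nhwc(size_t n, size_t c, size_t h, size_t w,
                          size_t C, size_t H, size_t W) {
    return ((n * H + h) * W + w) * C + c;
}

int main(void) {
    const size_t C = 64, H = 56, W = 56;
    /* Distance in elements between consecutive channels at one
     * spatial position, under each layout. */
    printf("NCHW channel stride: %zu elements\n",
           offset_nchw(0, 1, 0, 0, C, H, W) - offset_nchw(0, 0, 0, 0, C, H, W));
    printf("NHWC channel stride: %zu elements\n",
           offset_nhwc(0, 1, 0, 0, C, H, W) - offset_nhwc(0, 0, 0, 0, C, H, W));
    return 0;
}

For the 64 × 56 × 56 example above, the NCHW channel stride is H × W = 3136 elements while the NHWC stride is 1, which is why channel-innermost traversal turns scattered DRAM requests into long contiguous bursts, the effect the paper's computation sequences are designed to exploit.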

Citation (APA)

Nie, Z., Li, Z., Wang, L., Guo, S., & Dou, Q. (2018). Memory Bandwidth and Energy Efficiency Optimization of Deep Convolutional Neural Network Accelerators. In Communications in Computer and Information Science (Vol. 908, pp. 15–29). Springer Verlag. https://doi.org/10.1007/978-981-13-2423-9_2
