A reconfigurable streaming processor for real-time low-power execution of convolutional neural networks at the edge

This article is free to access.

Abstract

With recent advances in machine learning and the deep learning paradigm, there is strong demand to push data analytics and cognitive inference to the edge of the network, near the data producers and sensors. Edge analytics are essential for real-time video analytics and situational awareness, which are required for a wide range of cyber-physical applications such as smart transportation, smart cities, and smart health. To this end, novel architectures and platforms are required to enable real-time, low-power deep learning execution at the edge. This paper introduces a novel reconfigurable architecture for real-time execution of deep learning, in particular Convolutional Neural Networks (CNNs), at the edge of the network, close to the video camera. The proposed architecture offers a set of coarse-grain function blocks required for realizing CNN algorithms. A macro-pipelined datapath is created by chaining the function blocks according to the topology of the target network. The function blocks operate on streaming pixels (fed directly from the camera interface) in a producer/consumer fashion. At the same time, the function blocks offer enough flexibility to adjust the processing to area, power, and performance requirements. This paper focuses primarily on the first two layers of a CNN, as these are its two most compute-intensive layers. Our implementation on Xilinx Zynq FPGAs, for the first two layers of the SqueezeNet network, shows 315 mW power consumption when designed for 30 fps, with only a 0.24 ms one-time latency. In contrast, the Nvidia Tegra TX2 GPU is limited to 32.2 fps due to its 31.4 ms delay, with much higher power consumption (7.5 W).
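The producer/consumer chaining of function blocks described in the abstract can be illustrated with a small software analogy. The sketch below is not the paper's hardware design: the function names (`pixel_stream`, `conv3x3`, `relu`, `build_pipeline`) are hypothetical, and the single-channel 3x3 kernel with stride 1 and no padding is an assumption chosen for brevity rather than the actual SqueezeNet layer parameters. It only shows how a block can consume raster-order pixels and emit results as soon as its window is complete, buffering about two rows instead of a whole frame.

```python
from collections import deque
from typing import Iterable, Iterator, List


def pixel_stream(frame: List[List[float]]) -> Iterator[float]:
    """Producer: emit pixels in raster order, as a camera interface might."""
    for row in frame:
        yield from row


def conv3x3(pixels: Iterable[float], width: int,
            kernel: List[List[float]]) -> Iterator[float]:
    """Consumer/producer: 3x3 cross-correlation over a raster pixel stream.

    Assumptions (illustrative only): one channel, stride 1, no padding.
    Only the most recent 2*width + 3 pixels are kept, which mimics a
    line buffer; no full frame is ever stored.
    """
    buf = deque(maxlen=2 * width + 3)
    for i, p in enumerate(pixels):
        buf.append(p)
        r, c = divmod(i, width)
        if r >= 2 and c >= 2:  # the 3x3 window ending at (r, c) is complete
            acc = 0.0
            for dr in range(3):
                for dc in range(3):
                    # Pixel (r - dr, c - dc) sits dr*width + dc positions
                    # behind the newest pixel in the buffer.
                    acc += kernel[2 - dr][2 - dc] * buf[-1 - (dr * width + dc)]
            yield acc


def relu(values: Iterable[float]) -> Iterator[float]:
    """Consumer/producer: element-wise ReLU on the incoming stream."""
    for v in values:
        yield max(0.0, v)


def build_pipeline(frame: List[List[float]],
                   kernel: List[List[float]]) -> Iterator[float]:
    """Chain the blocks in network order: source -> conv -> ReLU."""
    width = len(frame[0])
    stream = pixel_stream(frame)
    stream = conv3x3(stream, width, kernel)
    return relu(stream)


if __name__ == "__main__":
    frame = [[float(4 * r + c) for c in range(4)] for r in range(4)]
    identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]  # passes the window centre
    print(list(build_pipeline(frame, identity)))  # [5.0, 6.0, 9.0, 10.0]
```

Because each block is a generator, no stage runs until the downstream stage asks for a value, and the first output appears after roughly two rows of input rather than after a full frame. This is a loose software counterpart to the low one-time latency that a streaming datapath aims for.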

Cite

Sanchez, J., Soltani, N., Kulkarni, P., Chamarthi, R. V., & Tabkhi, H. (2018). A reconfigurable streaming processor for real-time low-power execution of convolutional neural networks at the edge. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10973 LNCS, pp. 49–64). Springer Verlag. https://doi.org/10.1007/978-3-319-94340-4_4
