Software-Defined FPGA Accelerator Design for Mobile Deep Learning Applications

Abstract

Convolutional neural networks (CNNs) have been successfully used to attack problems such as object recognition, object detection, semantic segmentation, and scene understanding. The rapid development of deep learning has gone hand in hand with the adoption of GPUs for accelerating its core processes, such as network training and inference. Even though FPGA design predates the use of GPUs for accelerating computations, and despite high-level synthesis (HLS) tools becoming increasingly attractive, the adoption of FPGAs in deep learning research and application development remains poor because of the hardware-design expertise it requires. This work presents a workflow for accelerating mobile deep learning applications on small, low-cost, low-power FPGA devices using HLS tools. The workflow eases the design of an improved version of the SqueezeJet accelerator, which speeds up mobile-friendly, low-parameter ImageNet-class CNNs such as SqueezeNet v1.1 and ZynqNet. Additionally, the workflow includes the development of an HLS-driven analytical model used for performance estimation of the accelerator.
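The abstract refers to an HLS-driven analytical model for performance estimation but does not spell it out here. The sketch below is only a minimal, hypothetical illustration of that kind of model: it estimates convolution-layer cycle counts from layer dimensions, an assumed number of parallel MAC units (PAR_MACS), and an assumed per-row pipeline overhead. These names, constants, and the example layer dimensions are not taken from the paper.

```cpp
#include <cstdint>
#include <iostream>

// Hypothetical accelerator parameters; the actual SqueezeJet configuration
// is not given in this abstract.
constexpr int PAR_MACS = 64;          // assumed parallel multiply-accumulate units
constexpr int PIPELINE_OVERHEAD = 20; // assumed fill/flush cycles per output row

// Rough cycle estimate for one convolution layer: total MAC operations divided
// by the available MAC parallelism, plus per-output-row pipeline overhead.
uint64_t conv_layer_cycles(int out_h, int out_w, int out_ch, int in_ch, int k) {
    uint64_t macs = static_cast<uint64_t>(out_h) * out_w * out_ch * in_ch * k * k;
    uint64_t compute = (macs + PAR_MACS - 1) / PAR_MACS;              // ceiling division
    uint64_t overhead = static_cast<uint64_t>(out_h) * out_ch * PIPELINE_OVERHEAD;
    return compute + overhead;
}

int main() {
    // Example: a 3x3 expand convolution with SqueezeNet-v1.1-like dimensions
    // (55x55 output, 16 input channels, 64 output channels; assumed here).
    uint64_t cycles = conv_layer_cycles(55, 55, 64, 16, 3);
    std::cout << "Estimated cycles: " << cycles << '\n';
    std::cout << "Estimated latency at 100 MHz: "
              << cycles / 100e6 * 1e3 << " ms\n";
    return 0;
}
```

A model of this form can be evaluated per layer and summed over the network to predict end-to-end inference latency before synthesis, which is the role the abstract assigns to the analytical model.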

Citation (APA)

Mousouliotis, P. G., & Petrou, L. P. (2019). Software-Defined FPGA Accelerator Design for Mobile Deep Learning Applications. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11444 LNCS, pp. 68–77). Springer Verlag. https://doi.org/10.1007/978-3-030-17227-5_6
