A Parameterized Parallel Design Approach to Efficient Mapping of CNNs onto FPGA

3 citations · 17 Mendeley readers
Abstract

In recent years, Convolutional Neural Networks (CNNs) have been widely applied in artificial intelligence (AI) systems such as computer vision. Among the many existing hardware accelerators, the FPGA is regarded as a suitable platform for implementing CNNs because of its high energy efficiency and flexible reconfigurability. In this paper, a parameterized design approach is proposed to explore the maximum parallelism that can be implemented when mapping a CNN algorithm onto the targeted FPGA resources. Four types of parallelism are employed in the parameterized design to fully exploit the processing resources available on the FPGA. Meanwhile, a hardware library consisting of a set of modules is established to accommodate various CNN models. Furthermore, an algorithm is proposed to find the optimal level of parallelism for a constrained amount of resources. As a case study, the typical LeNet-5 is implemented on a Xilinx Zynq-7020. Compared with existing works that use the high-level synthesis design flow, our design achieves higher frames per second (FPS) and lower latency while using fewer LUTs and FFs.
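
To give a feel for the kind of resource-constrained search the abstract describes, the following is a minimal sketch (not the authors' algorithm) of a brute-force exploration over four assumed parallelism factors, here labeled for output channels, input channels, pixels, and kernel elements, with an idealized DSP/BRAM cost model. All budgets, factor names, and formulas are illustrative assumptions, not values from the paper.

```python
from itertools import product

# Hypothetical resource budgets for a Zynq-7020-class device
# (illustrative numbers only; not taken from the paper).
DSP_BUDGET = 220
BRAM_BUDGET = 140

def resource_cost(p_out, p_in, p_px, p_k):
    """Rough cost model (assumption): one DSP per multiply lane,
    plus a small BRAM overhead per parallel output channel."""
    dsps = p_out * p_in * p_px * p_k
    brams = 2 * p_out + 4
    return dsps, brams

def throughput_proxy(p_out, p_in, p_px, p_k):
    """Assume throughput scales with the number of parallel
    multiply-accumulate lanes (idealized, ignores memory stalls)."""
    return p_out * p_in * p_px * p_k

def best_parallelism(max_factor=16):
    """Exhaustively enumerate the four parallelism factors and keep
    the configuration with the highest throughput proxy that still
    fits within the resource budgets."""
    best, best_cfg = 0, None
    for cfg in product(range(1, max_factor + 1), repeat=4):
        dsps, brams = resource_cost(*cfg)
        if dsps <= DSP_BUDGET and brams <= BRAM_BUDGET:
            t = throughput_proxy(*cfg)
            if t > best:
                best, best_cfg = t, cfg
    return best_cfg, best

if __name__ == "__main__":
    cfg, lanes = best_parallelism()
    print(f"parallelism (out, in, pixel, kernel) = {cfg}, lanes = {lanes}")
```

Because the search space over four small factors is tiny, exhaustive enumeration is tractable here; a real design-space exploration would substitute a cycle-accurate performance model and post-synthesis resource usage for these placeholder formulas.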

Citation (APA)

Mao, N., Yang, H., & Huang, Z. (2023). A Parameterized Parallel Design Approach to Efficient Mapping of CNNs onto FPGA. Electronics (Switzerland), 12(5). https://doi.org/10.3390/electronics12051106
