A Novel FPGA Accelerator Design for Real-Time and Ultra-Low Power Deep Convolutional Neural Networks Compared with Titan X GPU

Shuai Li; Yukui Luo; Kuangyuan Sun; Nandakishor Yadav; Kyuwon Ken Choi

Journal Article, Open Access

IEEE Access (2020) 8 105455-105471

DOI: 10.1109/ACCESS.2020.3000009

58 Citations

51 Readers

Abstract

Convolutional neural network (CNN) based deep learning algorithms require high data flow and computational intensity. For real-time industrial applications, they must overcome challenges such as high data bandwidth requirements and power consumption on hardware platforms. In this work, we analyze in detail the data dependencies in the CNN accelerator and propose specific pipelined operations and a data organization scheme to design a high-throughput CNN accelerator on FPGA. In addition, we optimize the kernel operations to obtain high power efficiency. The proposed CNN accelerator supports image classification and real-time object detection with high accuracy. The evaluation results show that our CNN-based FPGA accelerator achieves 740 giga operations per second (GOPS) at 200 MHz with a kernel power of 12.2 W on an Intel Arria 10 FPGA. For object detection tasks, our system achieves 105 fps with 56.5 mAP or 25 fps with 73.6 mAP on the VOC dataset. Because we use a mixed fixed-point data representation, the detection accuracy is comparable with that of the GPU-based YOLO V2 framework. The power efficiency of our system is approximately 3.3 times better than that of a Titan X GPU and approximately 418 times better than that of an Intel E5-2620 V4 CPU.
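The headline figures in the abstract can be turned into a few derived quantities with simple arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, the inputs are the numbers quoted in the abstract, and the derived values (kernel-level GOPS per watt, operations per clock cycle, per-frame time budget) are not figures reported by the authors. Because the efficiency figure uses the quoted kernel power, it may not match the basis used for the GPU and CPU comparisons.

```python
# Figures quoted in the abstract (inputs to the arithmetic below).
THROUGHPUT_GOPS = 740.0   # giga operations per second
CLOCK_MHZ = 200.0         # accelerator clock frequency
KERNEL_POWER_W = 12.2     # kernel power on the Intel Arria 10 FPGA
# Detection operating points on VOC: frames per second -> mAP.
DETECTION_POINTS = {105.0: 56.5, 25.0: 73.6}


def gops_per_watt(gops: float, watts: float) -> float:
    """Throughput per watt, using the kernel power as the denominator."""
    return gops / watts


def ops_per_cycle(gops: float, clock_mhz: float) -> float:
    """Average number of operations completed per clock cycle."""
    return (gops * 1e9) / (clock_mhz * 1e6)


def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


if __name__ == "__main__":
    eff = gops_per_watt(THROUGHPUT_GOPS, KERNEL_POWER_W)
    print(f"Kernel-level efficiency: {eff:.1f} GOPS/W")
    opc = ops_per_cycle(THROUGHPUT_GOPS, CLOCK_MHZ)
    print(f"Operations per cycle:    {opc:.0f}")
    for fps, m_ap in DETECTION_POINTS.items():
        budget = frame_budget_ms(fps)
        print(f"{fps:.0f} fps ({m_ap} mAP): {budget:.2f} ms per frame")
```

Under these inputs, the kernel-level efficiency works out to about 60.7 GOPS/W, the design sustains 3,700 operations per clock cycle on average, and the two detection operating points leave roughly 9.5 ms and 40 ms per frame.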

Cite

Li, S., Luo, Y., Sun, K., Yadav, N., & Choi, K. K. (2020). A Novel FPGA Accelerator Design for Real-Time and Ultra-Low Power Deep Convolutional Neural Networks Compared with Titan X GPU. IEEE Access, 8, 105455–105471. https://doi.org/10.1109/ACCESS.2020.3000009
