FPGA-accelerated deep convolutional neural networks for high throughput and energy efficiency

45Citations
Citations of this article
65Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Recent breakthroughs in the deep convolutional neural networks (CNNs) have led to great improvements in the accuracy of both vision and auditory systems. Characterized by their deep structures and large numbers of parameters, deep CNNs challenge the computational performance of today. Hardware specialization in the form of field-programmable gate array offers a promising path towards major leaps in computational performance while achieving high-energy efficiency. In this paper, we focus on accelerating deep CNNs using the Xilinx Zynq-zq7045 FPGA SoC. As most of the computational workload can be converted to matrix multiplications, we adopt a matrix multiplier-based accelerator architecture. Dedicated units are designed to eliminate the conversion overhead. We also design a customized memory system according to the memory access pattern of CNNs. To make the accelerator easily usable by application developers, our accelerator supports Caffe, which is a widely used software framework of deep CNN. Different CNN models can be adopted by our accelerator, with good performance portability. The experimental results show that for a typical application of CNN, image classification, an average throughout of 77.8 GFLOPS is achieved, while the energy efficiency is 4.7× better than an Nvidia K20 GPGPU. © 2016 The Authors. Concurrency and Computation: Practice and Experience Published by John Wiley & Sons Ltd.

Cite

CITATION STYLE

APA

Qiao, Y., Shen, J., Xiao, T., Yang, Q., Wen, M., & Zhang, C. (2017). FPGA-accelerated deep convolutional neural networks for high throughput and energy efficiency. In Concurrency and Computation: Practice and Experience (Vol. 29). John Wiley and Sons Ltd. https://doi.org/10.1002/cpe.3850

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free