PRBN: A pipelined implementation of RBN for CNN Training

Abstract

Training CNNs (Convolutional Neural Networks) on-chip has recently attracted much attention. As CNNs have developed, the proportion of execution time spent in BN (Batch Normalization) layers has grown and can even exceed that of the convolutional layers. The BN layer accelerates the convergence of training, yet little work has focused on efficient hardware implementation of BN computation during training. In this work, we propose PRBN, an accelerator that supports both BN and convolution computation in training. In our design, a systolic array accelerates the convolution and matrix multiplication in training, and an RBN (Range Batch Normalization) array based on the hardware-friendly RBN algorithm handles the computation of the BN layers. We implement PRBN on the PYNQ-Z1 FPGA; it runs at 50 MHz and consumes 0.346 W. Experimental results show that, compared with a CPU (Intel Core i5-7500), PRBN achieves a 3.3× speedup in performance and an 8.9× improvement in energy efficiency.
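
The RBN layer mentioned in the abstract replaces the variance computation of standard BN with a per-channel range (max minus min), which removes the square and square-root operations that are costly in hardware. The sketch below is a minimal NumPy illustration of that idea, not the authors' hardware design; the function name, the (N, C) activation layout, and the 1/sqrt(2·ln n) scaling constant are assumptions drawn from the commonly cited RBN formulation.

```python
import numpy as np

def range_batch_norm(x, gamma, beta, eps=1e-5):
    """Range Batch Normalization over a mini-batch.

    Instead of dividing by the standard deviation (as in standard BN),
    activations are scaled by the per-channel range (max - min), which
    avoids square/sqrt operations and is cheaper to realize in hardware.
    x: (N, C) activations; gamma, beta: (C,) learnable scale and shift.
    """
    n = x.shape[0]
    mu = x.mean(axis=0)                      # per-channel mean
    rng = x.max(axis=0) - x.min(axis=0)      # per-channel range
    # Scaling constant relating the expected range of n Gaussian samples
    # to their standard deviation (assumption: the 1/sqrt(2*ln(n)) factor
    # used in the RBN literature).
    c_n = 1.0 / np.sqrt(2.0 * np.log(n))
    x_hat = (x - mu) / (c_n * rng + eps)
    return gamma * x_hat + beta
```

Normalizing by a scaled range keeps only comparisons, subtractions, and one division per channel, which is what makes the layer amenable to a pipelined hardware array.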

Cite

Yang, Z., Wang, L., Zhang, X., Ding, D., Xie, C., & Luo, L. (2020). PRBN: A pipelined implementation of RBN for CNN Training. In Communications in Computer and Information Science (Vol. 1256 CCIS, pp. 117–131). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-15-8135-9_9
