Specializing FGPU for Persistent Deep Learning

Rui Ma; Jia Ching Hsu; Tian Tan; Eriko Nurvitadhi; David Sheffield; Rob Pelt; Martin Langhammer; Jaewoong Sim; Aravind Dasu; Derek Chiou

Journal Article, Open Access

ACM Transactions on Reconfigurable Technology and Systems (2021) 14(2)

DOI: 10.1145/3457886

Abstract

Overlay architectures enable fast development and debugging on FPGAs, at the expense of potentially limited performance compared with fully customized FPGA designs. When used in concert with hand-tuned FPGA solutions, performant overlay architectures can improve time-to-solution and thus the overall productivity of FPGA solutions. This work tunes and specializes FGPU, an open-source, OpenCL-programmable GPU overlay for FPGAs. We demonstrate that our persistent deep learning (PDL)-FGPU architecture maintains the ease of programming and generality of GPU programming while achieving high performance through specialization for the persistent deep learning domain. We also propose an easy method to specialize for other domains. PDL-FGPU includes new instructions, along with micro-architecture and compiler enhancements. We evaluate both the FGPU baseline and the proposed PDL-FGPU on a modern high-end Intel Stratix 10 2800 FPGA in simulation, running persistent DL applications (RNN, GRU, LSTM) as well as non-DL applications to demonstrate generality. PDL-FGPU requires 1.4-3× more ALMs, 4.4-6.4× more M20Ks, and 1-9.5× more DSPs than the baseline, but improves performance by 56-693× for PDL applications, with an average 23.1% degradation on non-PDL applications. We integrated the PDL-FGPU overlay into Intel OPAE to measure real-world performance and power, and demonstrate that PDL-FGPU is only 4.0-10.4× slower than the Nvidia V100.

Cite

Ma, R., Hsu, J. C., Tan, T., Nurvitadhi, E., Sheffield, D., Pelt, R., … Chiou, D. (2021). Specializing FGPU for Persistent Deep Learning. ACM Transactions on Reconfigurable Technology and Systems, 14(2). https://doi.org/10.1145/3457886
