High-Frequency Systolic Array-Based Transformer Accelerator on Field Programmable Gate Arrays

Abstract

The systolic array is frequently used in neural network accelerators, including those for Transformer models, which have recently achieved remarkable progress in natural language processing (NLP) and machine translation. Due to the constraints of FPGA (Field Programmable Gate Array) EDA (Electronic Design Automation) tools and the limitations of design methodology, existing systolic array accelerators deployed on FPGAs often cannot achieve high frequency. In this work, we propose a high-frequency systolic array for an FPGA-based Transformer accelerator that can perform both the Multi-Head Attention (MHA) block and the position-wise Feed-Forward Network (FFN) block. On a Xilinx ZCU102 board, it reaches 588 MHz and 474 MHz for different array sizes, a frequency improvement of 1.8× and 1.5×, while drastically saving resources compared to similar recent works and pushing the utilization of each DSP slice to a higher level. We also propose a semi-automatic design flow with constraint-generating tools as a general solution for FPGA-based high-frequency systolic array deployment.
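As an illustration of the kind of computation a systolic array performs for the matrix multiplications in MHA and FFN blocks, the following is a minimal cycle-level software sketch. It assumes an output-stationary dataflow, in which each processing element (PE) keeps one output value while operands shift between neighbours; the abstract does not state which dataflow the paper uses, and the function name `systolic_matmul` is hypothetical.

```python
def systolic_matmul(a, b):
    """Simulate an output-stationary systolic array computing a @ b.

    Assumption (not from the paper): PE(i, j) accumulates C[i][j],
    rows of `a` enter from the left and columns of `b` from the top,
    each skewed by one cycle per row/column.
    """
    m, k, n = len(a), len(b), len(b[0])
    acc = [[0] * n for _ in range(m)]    # accumulator held in each PE
    a_reg = [[0] * n for _ in range(m)]  # operand registers moving right
    b_reg = [[0] * n for _ in range(m)]  # operand registers moving down
    cycles = k + m + n - 2               # cycles until the last PE finishes

    for t in range(cycles):
        # Shift A operands one PE to the right; feed the left edge.
        for i in range(m):
            for j in range(n - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]
            s = t - i                    # row i is delayed by i cycles
            a_reg[i][0] = a[i][s] if 0 <= s < k else 0
        # Shift B operands one PE down; feed the top edge.
        for j in range(n):
            for i in range(m - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]
            s = t - j                    # column j is delayed by j cycles
            b_reg[0][j] = b[s][j] if 0 <= s < k else 0
        # Every PE performs one multiply-accumulate per cycle.
        for i in range(m):
            for j in range(n):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc, cycles


result, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(result, cycles)
```

Because each PE only exchanges data with its immediate neighbours, the wiring in such an array is short and regular, which is one reason the structure is a natural candidate for high clock frequencies when placement is well constrained.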

Cite

Chen, Y., Li, T., Chen, X., Cai, Z., & Su, T. (2023). High-Frequency Systolic Array-Based Transformer Accelerator on Field Programmable Gate Arrays. Electronics (Switzerland), 12(4). https://doi.org/10.3390/electronics12040822
