A high speed reconfigurable architecture for softmax and GELU in vision transformer

16 citations · 9 Mendeley readers
Abstract

Transformers have been widely used in computer vision applications. Compared with traditional convolutional neural networks (CNNs), a transformer's inference involves many non-linear operations, such as softmax and the Gaussian error linear unit (GELU). As transformers grow in scale, efficient hardware implementation of these operations becomes significant. However, current computer vision accelerators focus on CNNs, and less attention has been paid to transformers. In addition, most existing FPGA-based softmax or GELU accelerators are not designed for the vision transformer (ViT). To address this problem, this work proposes a high-speed reconfigurable accelerator whose architecture supports both the softmax and GELU functions in ViT by reconfiguring the data path. The architecture is implemented on a Xilinx XCVU37P through mathematical transformation and hardware-oriented optimization, and achieves a throughput of 102.4 gigabits per second (Gbps) at 200 MHz. Experimental results show that the architecture incurs very small accuracy loss in ViT inference with fixed-point 16-bit quantization. Compared with existing accelerators, the design offers greater throughput and area efficiency.
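As context for the abstract, the sketch below shows the two non-linearities in floating point, a generic 16-bit fixed-point quantization of their outputs, and a sanity check of the stated throughput. This is illustrative only: the Q4.11 fixed-point format, the tanh-based GELU approximation, and the 32-lane interpretation of 102.4 Gbps at 200 MHz are assumptions, not details taken from the paper.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax: subtract the max before exponentiating.
    e = np.exp(x - np.max(x))
    return e / np.sum(e)

def gelu(x):
    # tanh-based approximation of GELU, widely used in transformer
    # implementations (the paper's exact approximation is not given here).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def quantize_fx16(x, frac_bits=11):
    # Round to a signed 16-bit fixed-point grid; Q4.11 is an assumed
    # format (1 sign, 4 integer, 11 fraction bits), not the paper's.
    scale = 1 << frac_bits
    q = np.clip(np.round(x * scale), -(1 << 15), (1 << 15) - 1)
    return q / scale

x = np.linspace(-4.0, 4.0, 9)
err_softmax = np.max(np.abs(softmax(x) - quantize_fx16(softmax(x))))
err_gelu = np.max(np.abs(gelu(x) - quantize_fx16(gelu(x))))

# The reported 102.4 Gbps at 200 MHz is consistent with 512 bits/cycle,
# i.e. 32 parallel 16-bit results per clock (assumed lane interpretation).
bits_per_cycle = 102.4e9 / 200e6  # = 512.0
```

Rounding to a Q4.11 grid bounds the per-element output error by 2^-12, which is one plausible reading of the "very small accuracy loss" claim for 16-bit quantization.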

Citation (APA)

Li, T., Zhang, F., Xie, G., Fan, X., Gao, Y., & Sun, M. (2023). A high speed reconfigurable architecture for softmax and GELU in vision transformer. Electronics Letters, 59(5). https://doi.org/10.1049/ell2.12751
