A Cloud Server Oriented FPGA Accelerator for LSTM Recurrent Neural Network


This article is free to access.

Abstract

Long Short-Term Memory (LSTM) is the most widely used recurrent neural network architecture. It plays an important role in a number of research areas, such as language modeling, machine translation, and image captioning. However, owing to its recurrent nature, general-purpose processors such as CPUs and GPGPUs achieve limited parallelism on LSTM workloads while consuming considerable energy. FPGA accelerators can outperform general-purpose processors on recurrence-based algorithms, offering flexibility, energy efficiency, and finer-grained optimization. In this paper, we present the design and implementation of a cloud-oriented FPGA accelerator for LSTM. Unlike most previous designs, which target embedded systems, our accelerator transfers data sequences to and from the host server over PCIe and performs multiple time-series predictions in parallel. We optimize both the on-chip computation and the communication between the host server and the FPGA board. We evaluate the overall performance as well as the computation and PCIe communication costs. The results show that our implementation outperforms CPU-based and other hardware-based implementations.
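The workload being accelerated is the standard LSTM cell recurrence. As a reference point (this sketch is not from the paper; weight layout and gate ordering are illustrative assumptions), one LSTM time step can be written in plain Python as:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(M, v):
    # Dense matrix-vector product; on the FPGA this is the main
    # computation that gets parallelized across processing elements.
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lstm_cell(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: 4*H rows of length D (input weights), U: 4*H rows of length H
    (recurrent weights), b: length 4*H. Gate pre-activations are packed
    in the order [input, forget, cell candidate, output] (an assumed
    convention, not specified by the paper).
    """
    H = len(h_prev)
    wx, uh = matvec(W, x), matvec(U, h_prev)
    z = [a + r + bias for a, r, bias in zip(wx, uh, b)]
    i = [sigmoid(v) for v in z[0:H]]          # input gate
    f = [sigmoid(v) for v in z[H:2*H]]        # forget gate
    g = [math.tanh(v) for v in z[2*H:3*H]]    # candidate cell update
    o = [sigmoid(v) for v in z[3*H:4*H]]      # output gate
    c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c_prev, i, g)]
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c
```

The data dependency of `h` and `c` on the previous time step is what limits parallelism across time on general-purpose processors; the accelerator instead exploits parallelism across independent sequences, batching multiple predictions per PCIe transfer.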

APA

Liu, J., Wang, J., Zhou, Y., & Liu, F. (2019). A Cloud Server Oriented FPGA Accelerator for LSTM Recurrent Neural Network. IEEE Access, 7, 122408–122418. https://doi.org/10.1109/ACCESS.2019.2938234
