Abstract
The Long Short-Term Memory (LSTM) network is the most widely used recurrent neural network architecture. It plays an important role in a number of research areas, such as language modeling, machine translation, and image captioning. However, owing to its recurrent nature, general-purpose processors such as CPUs and GPGPUs achieve only limited parallelism while consuming substantial energy. FPGA accelerators can outperform general-purpose processors thanks to their flexibility, energy efficiency, and finer-grained optimization opportunities for recurrence-based algorithms. In this paper, we present the design and implementation of a cloud-oriented FPGA accelerator for LSTM. Unlike most previous works, which target embedded systems, our FPGA accelerator transfers data sequences to and from the host server through PCIe and performs multiple time-series predictions in parallel. We optimize both the on-chip computation and the communication between the host server and the FPGA board. We perform experiments to evaluate the overall performance as well as the computation and the PCIe communication individually. The results show that our implementation outperforms CPU-based and other hardware-based implementations.
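For context, a minimal sketch of the standard LSTM cell (the paper's exact formulation may differ, e.g., with peephole connections) shows why the recurrence limits parallelism: every gate at step t consumes the previous hidden state h_{t-1}, so time steps within one sequence must be computed serially, whereas independent sequences can run concurrently, which is the parallelism the accelerator exploits.

\[
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
\]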
Citation
Liu, J., Wang, J., Zhou, Y., & Liu, F. (2019). A Cloud Server Oriented FPGA Accelerator for LSTM Recurrent Neural Network. IEEE Access, 7, 122408–122418. https://doi.org/10.1109/ACCESS.2019.2938234