Approximate FPGA-based LSTMs under computation time constraints

15 Citations · 47 Readers (Mendeley)

Abstract

Recurrent Neural Networks, most prominently Long Short-Term Memory (LSTM) networks, have demonstrated state-of-the-art accuracy in several emerging Artificial Intelligence tasks. Nevertheless, the highest-performing LSTM models are becoming increasingly demanding in terms of computational and memory load. At the same time, emerging latency-sensitive applications, including mobile robots and autonomous vehicles, often operate under stringent computation time constraints. In this paper, we address the challenge of deploying computationally demanding LSTMs under a constrained time budget by introducing an approximate computing scheme that combines iterative low-rank compression and pruning with a novel FPGA-based LSTM architecture. Combined in an end-to-end framework, the approximation method's parameters are optimised and the architecture is configured to address the problem of high-performance LSTM execution in time-constrained applications. Quantitative evaluation on a real-life image captioning application indicates that the proposed system requires up to 6.5× less time to achieve the same application-level accuracy as a baseline method, while achieving on average 25× higher accuracy under the same computation time constraints.
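The combination of low-rank compression and pruning described above can be illustrated with a minimal sketch. This is not the paper's exact algorithm: it uses a truncated SVD for the low-rank factorisation and simple magnitude pruning of the factors, with illustrative `rank` and `sparsity` values rather than the parameters optimised by the paper's framework.

```python
import numpy as np

def low_rank_prune(W, rank, sparsity):
    """Illustrative sketch (not the paper's exact method): approximate a
    weight matrix W with a rank-`rank` truncated-SVD factorisation, then
    magnitude-prune a `sparsity` fraction of each factor's entries."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep only the top-`rank` singular components: W ~= A @ B.
    A = U[:, :rank] * s[:rank]   # shape (m, rank)
    B = Vt[:rank, :]             # shape (rank, n)
    # Magnitude pruning: zero out the smallest-magnitude entries.
    for M in (A, B):
        thresh = np.quantile(np.abs(M), sparsity)
        M[np.abs(M) < thresh] = 0.0
    return A, B

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))   # e.g. one LSTM gate's weight matrix
A, B = low_rank_prune(W, rank=32, sparsity=0.5)
# Multiply-accumulate count per matrix-vector product drops from
# 256*256 (dense W @ x) to 256*32 + 32*256 (A @ (B @ x)),
# before even accounting for the zeroed entries.
```

A hardware implementation would additionally exploit the induced sparsity by skipping zero multiplications; the compression ratio here trades off against application-level accuracy, which is the knob the end-to-end framework tunes against the time budget.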

Citation (APA)

Rizakis, M., Venieris, S. I., Kouris, A., & Bouganis, C. S. (2018). Approximate FPGA-based LSTMs under computation time constraints. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10824 LNCS, pp. 3–15). Springer Verlag. https://doi.org/10.1007/978-3-319-78890-6_1
