Abstract
Recurrent neural networks (RNNs) are used in many real-world text and speech applications. They include complex modules such as recurrence, exponential-based activations, gate interactions, unfoldable normalization, bi-directional dependence, and attention. The interaction between these elements prevents running them on integer-only operations without a significant performance drop. Deploying RNNs that include layer normalization and attention on integer-only arithmetic remains an open problem. We present a quantization-aware training method for obtaining a highly accurate integer-only recurrent neural network (iRNN). Our approach supports layer normalization, attention, and an adaptive piecewise linear (PWL) approximation of activations, to serve a wide range of RNNs across various applications. The proposed method is shown to work on RNN-based language models and challenging automatic speech recognition, enabling AI applications on the edge. Our iRNN maintains performance comparable to its full-precision counterpart; deploying it on smartphones improves runtime performance by 2× and reduces model size by 4×.
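The abstract does not spell out how the PWL approximation of activations is built or quantized; as a rough illustration only (not the authors' implementation), the sketch below evaluates a piecewise-linear sigmoid entirely with integer arithmetic, using hypothetical breakpoints and an assumed Q8 fixed-point format:

```python
import numpy as np

# Minimal sketch (not the paper's method): approximate sigmoid with a
# piecewise-linear function evaluated using integer-only arithmetic.
# Breakpoints and the Q8 fixed-point scale below are illustrative choices,
# not the adaptive PWL parameters learned in iRNN.

FRAC_BITS = 8                 # fixed-point fractional bits (Q8)
SCALE = 1 << FRAC_BITS

# Hypothetical PWL knots for sigmoid on [-4, 4], stored as Q8 integers.
knots_x = (np.array([-4.0, -2.0, 0.0, 2.0, 4.0]) * SCALE).astype(np.int32)
knots_y = (1.0 / (1.0 + np.exp(-knots_x / SCALE)) * SCALE).astype(np.int32)

def pwl_sigmoid_int(x_q8: np.ndarray) -> np.ndarray:
    """Evaluate the PWL sigmoid with integer ops only (Q8 in, Q8 out)."""
    x = np.clip(x_q8, knots_x[0], knots_x[-1])
    # Locate the linear segment each input falls into.
    seg = np.clip(np.searchsorted(knots_x, x, side="right") - 1,
                  0, len(knots_x) - 2)
    x0, x1 = knots_x[seg], knots_x[seg + 1]
    y0, y1 = knots_y[seg], knots_y[seg + 1]
    # Integer linear interpolation: y0 + (y1 - y0) * (x - x0) / (x1 - x0).
    return y0 + ((y1 - y0) * (x - x0)) // (x1 - x0)

# Usage: compare the integer PWL output against floating-point sigmoid.
x = np.linspace(-5, 5, 11)
approx = pwl_sigmoid_int((x * SCALE).astype(np.int32)) / SCALE
print(np.round(approx - 1.0 / (1.0 + np.exp(-x)), 3))
```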
CITATION STYLE
Sari, E., Courville, V., & Nia, V. P. (2022). iRNN: Integer-only Recurrent Neural Network. In International Conference on Pattern Recognition Applications and Methods (Vol. 1, pp. 110–121). Science and Technology Publications, Lda. https://doi.org/10.5220/0010975700003122