Deep neural networks (DNNs) are neural networks with many hidden layers. DNNs have become popular in automatic speech recognition tasks, which combine a good acoustic model with a language model. Standard feedforward neural networks cannot handle speech data well, since they have no way to feed information from a later processing step back to an earlier one and thus cannot exploit temporal context. Recurrent neural networks (RNNs) were therefore introduced to take temporal dependencies into account. However, RNNs struggle to handle long-term dependencies because of the vanishing/exploding gradient problem. Long Short-Term Memory (LSTM) networks, a special kind of RNN, were therefore introduced to capture long-term as well as short-term dependencies in speech. Similarly, Gated Recurrent Unit (GRU) networks refine the LSTM design while also taking long-term dependencies into consideration. In this paper, we evaluate RNN, LSTM, and GRU networks and compare their performance on a reduced TED-LIUM speech data set. The results show that LSTM achieves the best word error rates, while GRU optimization is faster and achieves word error rates close to those of LSTM.
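To make the architectural comparison concrete, the following is a minimal sketch (not taken from the paper) of how comparable RNN, LSTM, and GRU acoustic models could be set up with Keras. The layer sizes, feature dimensions, and frame-level softmax targets are illustrative assumptions; the paper's actual pipeline, decoder, and TED-LIUM preprocessing are not reproduced here.

# Minimal sketch (assumptions, not the paper's setup): build RNN, LSTM, and GRU
# acoustic models over an identical topology so that only the recurrence differs.
import tensorflow as tf

N_FRAMES = 100      # assumed number of acoustic frames per utterance
N_FEATURES = 40     # assumed filterbank/MFCC features per frame
N_CLASSES = 46      # assumed number of frame-level phoneme/state targets

def build_model(cell: str) -> tf.keras.Model:
    """Stack two recurrent layers of the requested type over the same topology."""
    recurrent = {
        "rnn": tf.keras.layers.SimpleRNN,   # plain recurrence
        "lstm": tf.keras.layers.LSTM,       # gated, with a memory cell
        "gru": tf.keras.layers.GRU,         # gated, without a separate memory cell
    }[cell]
    inputs = tf.keras.Input(shape=(N_FRAMES, N_FEATURES))
    x = recurrent(128, return_sequences=True)(inputs)   # keep per-frame outputs
    x = recurrent(128, return_sequences=True)(x)
    outputs = tf.keras.layers.Dense(N_CLASSES, activation="softmax")(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    return model

# Using the same training recipe for each cell type keeps the comparison fair:
# only the recurrence (plain, LSTM, or GRU) changes between the three models.
for cell in ("rnn", "lstm", "gru"):
    print(cell, build_model(cell).count_params())

Keeping the topology fixed across the three cell types isolates the effect of the recurrence itself, which is the kind of comparison the abstract describes.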
Shewalkar, A., Nyavanandi, D., & Ludwig, S. A. (2019). Performance Evaluation of Deep Neural Networks Applied to Speech Recognition: RNN, LSTM and GRU. Journal of Artificial Intelligence and Soft Computing Research, 9(4), 235–245. https://doi.org/10.2478/jaiscr-2019-0006