Long Short-Term Memory Recurrent Neural Network for Automatic Speech Recognition

Abstract

Automatic speech recognition (ASR) is one of the most demanding tasks in natural language processing owing to its complexity. Recently, deep learning approaches have been deployed for this task and have been shown to outperform traditional machine learning approaches such as the artificial neural network (ANN). In particular, deep learning methods such as long short-term memory (LSTM) have achieved improved ASR performance. However, this method is limited in its ability to process continuous input streams: a traditional LSTM requires four linear (multilayer perceptron, MLP) layers per cell, demanding large memory bandwidth at each sequence time step. The LSTM cannot accommodate the many computational units required for processing continuous input streams because the system lacks the memory bandwidth to feed those units. In this study, an enhanced deep learning LSTM recurrent neural network (RNN) model is proposed to resolve this shortcoming. In the proposed model, the RNN incorporates a "forget gate" into the memory block, allowing cell states to be reset at the beginning of sub-sequences. This enables the system to process continuous input streams efficiently without increasing the required bandwidth. The standard LSTM architecture is also modified to use the model parameters effectively. Several CNN-based and sequential models were trained on the same dataset and compared with the proposed model. The LSTM-RNN outperformed the other deep learning models, achieving 99.36% accuracy on a well-established public benchmark dataset of spoken English digits.
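The mechanism the abstract describes — four linear (MLP) layers per cell, with a forget gate that scales the previous cell state and can reset it at sub-sequence boundaries — can be sketched as a single LSTM time step in NumPy. This is an illustrative sketch of the standard LSTM equations, not the authors' implementation; the function name `lstm_step` and the stacked parameter layout are assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step (illustrative sketch).

    W, U, b hold the four gate parameter blocks stacked row-wise
    (input, forget, candidate, output) -- the four linear (MLP)
    layers per cell that the abstract refers to.
    """
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b      # all four linear layers in one product
    i = sigmoid(z[0:n])             # input gate
    f = sigmoid(z[n:2*n])           # forget gate: f ~ 0 resets the cell state
    g = np.tanh(z[2*n:3*n])         # candidate cell update
    o = sigmoid(z[3*n:4*n])         # output gate
    c = f * c_prev + i * g          # forget gate scales the old cell state
    h = o * np.tanh(c)              # new hidden state
    return h, c
```

At a sub-sequence boundary, driving the forget gate toward zero makes the new cell state depend only on the current input (c ≈ i * g), which is the resetting behavior the proposed model exploits.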

Citation (APA)
Oruh, J., Viriri, S., & Adegun, A. (2022). Long Short-Term Memory Recurrent Neural Network for Automatic Speech Recognition. IEEE Access, 10, 30069–30079. https://doi.org/10.1109/ACCESS.2022.3159339
