Real-Time Speech Enhancement Based on Convolutional Recurrent Neural Network

Abstract

Speech enhancement is the task of taking a noisy speech input and producing an enhanced speech output. In recent years, the need for speech enhancement has grown because of challenges arising in applications such as hearing aids, Automatic Speech Recognition (ASR), and mobile speech communication systems. Most speech enhancement research has been carried out for English, Chinese, and European languages; only a few studies address speech enhancement in Indian regional languages. In this paper, we propose a two-stage architecture for enhancing Tamil speech signals, based on a convolutional recurrent neural network (CRN). It targets real-time, single-channel enhancement, that is, a single track of sound produced by the speaker. In the first stage, a mask-based long short-term memory (LSTM) network, together with a loss function, is used for noise suppression. In the second stage, a Convolutional Encoder-Decoder (CED) is used for speech restoration. The proposed model is evaluated on various speakers and noisy environments, including babble noise, car noise, and white Gaussian noise. The proposed CRN model improves speech quality by 0.1 points compared with the LSTM base model, and it requires fewer parameters for training. The performance of the proposed model is outstanding even at low Signal to Noise Ratio (SNR).
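The abstract does not say how the noisy test mixtures were built. As an illustration only, the sketch below shows one common way to mix a noise recording into clean speech at a chosen SNR, using the standard definition SNR = 10 log10(P_clean / P_noise). The function names (`power`, `snr_db`, `mix_at_snr`), the 16 kHz sample rate, and the sine tone standing in for speech are all assumptions for the example, not details from the paper.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def snr_db(clean, noise):
    """SNR in decibels: 10 * log10(P_clean / P_noise)."""
    return 10.0 * math.log10(power(clean) / power(noise))


def mix_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so that clean + scaled noise has the requested SNR."""
    # Noise power needed to reach the target SNR for this clean signal.
    target_noise_power = power(clean) / (10.0 ** (target_snr_db / 10.0))
    gain = math.sqrt(target_noise_power / power(noise))
    scaled = [gain * n for n in noise]
    noisy = [c + n for c, n in zip(clean, scaled)]
    return noisy, scaled


rng = random.Random(0)
sample_rate = 16000  # assumed sample rate, not taken from the paper

# Hypothetical stand-ins: a 440 Hz tone for "speech", Gaussian samples for white noise.
clean = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

# Build a low-SNR mixture (-5 dB) and check the SNR that was actually achieved.
noisy, scaled_noise = mix_at_snr(clean, noise, target_snr_db=-5.0)
achieved = snr_db(clean, scaled_noise)
```

Here `achieved` matches the requested value up to floating-point error, because the gain is derived directly from the two signal powers. A negative target such as -5 dB means the noise carries more power than the speech, which is the kind of low-SNR condition the abstract refers to.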

Cite

Girirajan, S., & Pandian, A. (2023). Real-Time Speech Enhancement Based on Convolutional Recurrent Neural Network. Intelligent Automation and Soft Computing, 35(2), 1987–2001. https://doi.org/10.32604/iasc.2023.028090
