Improving dysarthric speech recognition using empirical mode decomposition and convolutional neural network

Mohammed Sidi Yakoub; Sid Ahmed Selouani; Brahim Fares Zaidi; Asma Bouchair

Journal Article (Open Access)

Eurasip Journal on Audio, Speech, and Music Processing (2020) 2020(1)

DOI: 10.1186/s13636-019-0169-5

38 Citations

26 Readers

Abstract

In this paper, we combine empirical mode decomposition with Hurst-based mode selection (EMDH) and a deep learning architecture based on a convolutional neural network (CNN) to improve the recognition of dysarthric speech. The EMDH speech enhancement technique is applied as a preprocessing step to improve the quality of the dysarthric speech. Mel-frequency cepstral coefficients (MFCCs) are then extracted from the EMDH-processed speech and used as input features to a CNN-based recognizer. The effectiveness of the proposed EMDH-CNN approach is demonstrated by results obtained on the Nemours corpus of dysarthric speech. Compared to baseline systems that use hidden Markov models with Gaussian mixture models (HMM-GMMs) and a CNN without an enhancement module, the EMDH-CNN system increases the overall accuracy by 20.72% and 9.95%, respectively, in a k-fold cross-validation experimental setup.
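The Hurst-based mode selection step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the intrinsic mode functions have already been computed by some EMD implementation (not shown here). The Hurst estimator (log-log slope of lagged-difference spread), the threshold value, the selection direction, and the function names `hurst_exponent` and `reconstruct_from_modes` are all illustrative assumptions, not the authors' method or settings.

```python
import numpy as np


def hurst_exponent(x, max_lag=64):
    """Rough Hurst estimate (an assumed estimator, not the paper's).

    The series is treated as increments: its cumulative sum is formed,
    and the standard deviation of lagged differences of that sum is
    assumed to scale as lag**H. H is the slope of a log-log line fit.
    A constant input would give zero spread and is not handled here.
    """
    x = np.asarray(x, dtype=float)
    walk = np.cumsum(x - x.mean())
    lags = np.arange(2, max_lag)
    spread = np.array([np.std(walk[lag:] - walk[:-lag]) for lag in lags])
    slope, _ = np.polyfit(np.log(lags), np.log(spread), 1)
    return slope


def reconstruct_from_modes(imfs, threshold=0.5, keep_below=True):
    """Sum the modes whose Hurst estimate falls on the chosen side of
    `threshold`. Both the threshold and the direction are hypothetical
    parameters chosen for illustration.
    """
    imfs = np.asarray(imfs, dtype=float)
    h = np.array([hurst_exponent(mode) for mode in imfs])
    keep = h < threshold if keep_below else h >= threshold
    return imfs[keep].sum(axis=0), h


if __name__ == "__main__":
    # Two synthetic stand-ins for modes: white noise and a slow sinusoid.
    rng = np.random.default_rng(0)
    n = 16384
    modes = np.stack([
        rng.standard_normal(n),
        np.sin(2 * np.pi * np.arange(n) / 2048),
    ])
    enhanced, estimates = reconstruct_from_modes(modes, threshold=0.75)
    print("Hurst estimates per mode:", estimates)
    print("Reconstructed signal length:", enhanced.shape[0])
```

In a full pipeline along the lines the abstract describes, the reconstructed signal would then be passed to MFCC extraction and on to the CNN recognizer; those stages are outside this sketch.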

Cite

Sidi Yakoub, M., Selouani, S. A., Zaidi, B. F., & Bouchair, A. (2020). Improving dysarthric speech recognition using empirical mode decomposition and convolutional neural network. Eurasip Journal on Audio, Speech, and Music Processing, 2020(1). https://doi.org/10.1186/s13636-019-0169-5
