Improving dysarthric speech recognition using empirical mode decomposition and convolutional neural network

Abstract

In this paper, we use empirical mode decomposition with Hurst-based mode selection (EMDH) together with a deep learning architecture based on a convolutional neural network (CNN) to improve the recognition of dysarthric speech. The EMDH speech enhancement technique is applied as a preprocessing step to improve the quality of dysarthric speech. Mel-frequency cepstral coefficients (MFCCs) are then extracted from the EMDH-enhanced speech and used as input features to a CNN-based recognizer. The effectiveness of the proposed EMDH-CNN approach is demonstrated on the Nemours corpus of dysarthric speech. Compared to baseline systems that use Hidden Markov Models with Gaussian Mixtures (HMM-GMMs) and a CNN without the enhancement module, the EMDH-CNN system increases the overall accuracy by 20.72% and 9.95%, respectively, under a k-fold cross-validation experimental setup.
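The article itself does not include code, but the pipeline it describes can be illustrated with a short sketch: decompose the waveform with EMD, keep only the intrinsic mode functions (IMFs) whose estimated Hurst exponent exceeds a threshold, reconstruct an enhanced signal, extract MFCCs, and feed them to a small CNN classifier. The Hurst threshold, the rescaled-range estimator, the MFCC settings, and the network layout below are illustrative assumptions, not the authors' exact configuration.

```python
# Hypothetical sketch of an EMDH-CNN pipeline (not the authors' exact setup).
# Requires: numpy, PyEMD (pip install EMD-signal), librosa, tensorflow.
import numpy as np
import librosa
import tensorflow as tf
from PyEMD import EMD


def hurst_rs(x, min_chunk=16):
    """Rough rescaled-range (R/S) estimate of the Hurst exponent."""
    x = np.asarray(x, dtype=float)
    sizes, rs_vals = [], []
    n = min_chunk
    while n <= len(x) // 2:
        rs = []
        for start in range(0, len(x) - n + 1, n):
            chunk = x[start:start + n]
            dev = np.cumsum(chunk - chunk.mean())
            r, s = dev.max() - dev.min(), chunk.std()
            if s > 0:
                rs.append(r / s)
        if rs:
            sizes.append(n)
            rs_vals.append(np.mean(rs))
        n *= 2
    # Slope of log(R/S) vs. log(n) approximates the Hurst exponent.
    return np.polyfit(np.log(sizes), np.log(rs_vals), 1)[0]


def emdh_enhance(signal, hurst_threshold=0.5):
    """Keep only IMFs whose Hurst exponent exceeds the (assumed) threshold."""
    imfs = EMD()(signal)
    selected = [imf for imf in imfs if hurst_rs(imf) > hurst_threshold]
    return np.sum(selected, axis=0) if selected else signal


def mfcc_features(signal, sr=16000, n_mfcc=13):
    """MFCC matrix (n_mfcc x frames), used as the CNN input 'image'."""
    return librosa.feature.mfcc(y=signal.astype(np.float32), sr=sr, n_mfcc=n_mfcc)


def build_cnn(input_shape, num_classes):
    """Small 2-D CNN over the MFCC time-frequency plane (illustrative)."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),
        tf.keras.layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])


if __name__ == "__main__":
    sr = 16000
    # Synthetic stand-in for a one-second dysarthric utterance (tone + noise).
    signal = np.sin(2 * np.pi * 220 * np.arange(sr) / sr) + 0.3 * np.random.randn(sr)
    enhanced = emdh_enhance(signal)
    feats = mfcc_features(enhanced, sr=sr)[..., np.newaxis]  # (n_mfcc, frames, 1)
    model = build_cnn(feats.shape, num_classes=10)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    model.summary()
```

In a real system the CNN would of course be trained on labeled dysarthric utterances (e.g., the Nemours corpus used in the paper); the sketch only shows how the EMDH enhancement and MFCC extraction stages would be chained ahead of the recognizer.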

Citation (APA)

Sidi Yakoub, M., Selouani, S.-A., Zaidi, B. F., & Bouchair, A. (2020). Improving dysarthric speech recognition using empirical mode decomposition and convolutional neural network. EURASIP Journal on Audio, Speech, and Music Processing, 2020(1). https://doi.org/10.1186/s13636-019-0169-5
