Enhanced Speech Emotion Recognition Using DCGAN-Based Data Augmentation

Abstract

Although speech emotion recognition has received increasing attention in research and applications, it remains challenging because emotions are diverse and complex and the available datasets are limited. To address these limitations, we propose an approach that uses a deep convolutional generative adversarial network (DCGAN) to augment data from the RAVDESS and EmoDB databases. We then assess emotion recognition on mel-spectrogram data using a model that combines a CNN and a BiLSTM. The preliminary experimental results indicate that the proposed technique improves speech emotion recognition performance. These results suggest directions for further development in speech emotion recognition and point to its potential for practical applications.
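
The abstract names mel-spectrograms as the model input but does not state how they are computed. As a rough illustration only, the sketch below builds a triangular mel filterbank with the widely used HTK-style mel formula and applies it to one power-spectrum frame. The function names, the filter count, the FFT size, and the sample rate are all hypothetical choices made for this example and are not taken from the paper.

```python
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate, fmin=0.0, fmax=None):
    """Return n_mels triangular filters over the n_fft // 2 + 1 FFT bins.

    All parameter values are illustrative assumptions, not the paper's.
    """
    if fmax is None:
        fmax = sample_rate / 2.0
    lo, hi = hz_to_mel(fmin), hz_to_mel(fmax)
    # n_mels + 2 edge frequencies, equally spaced on the mel scale.
    edges = [mel_to_hz(lo + (hi - lo) * i / (n_mels + 1))
             for i in range(n_mels + 2)]
    bin_freqs = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(n_mels):
        left, center, right = edges[m], edges[m + 1], edges[m + 2]
        row = []
        for f in bin_freqs:
            rising = (f - left) / (center - left)
            falling = (right - f) / (right - center)
            # Triangle: ramps up to 1 at the center, back to 0 at the right edge.
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def apply_filterbank(bank, power_spectrum):
    """Project one power-spectrum frame onto the mel bands."""
    return [sum(w * p for w, p in zip(row, power_spectrum)) for row in bank]


if __name__ == "__main__":
    bank = mel_filterbank(n_mels=8, n_fft=64, sample_rate=16000)
    flat_frame = [1.0] * 33  # a dummy, spectrally flat frame
    print(apply_filterbank(bank, flat_frame))
```

Stacking the per-frame outputs over time, usually after a logarithm, gives the two-dimensional mel-spectrogram that a convolutional model can treat like an image.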

Cite

Baek, J. Y., & Lee, S. P. (2023). Enhanced Speech Emotion Recognition Using DCGAN-Based Data Augmentation. Electronics (Switzerland), 12(18). https://doi.org/10.3390/electronics12183966
