A study on a speech emotion recognition system with effective acoustic features using deep learning algorithms

Abstract

A key goal of a human interface is to recognize the user’s emotional state precisely. In speech emotion recognition research, the most important issue is the effective combination of extracting suitable speech features and applying an appropriate classification engine. Well-defined speech databases are also needed to accurately recognize and analyze emotions from speech signals. In this work, we constructed a Korean emotional speech database for speech emotion analysis and proposed a feature combination that improves emotion recognition performance using a recurrent neural network model. To identify acoustic features that reflect distinct momentary changes in emotional expression, we extracted F0, Mel-frequency cepstrum coefficients, spectral features, harmonic features, and others. Statistical analysis was performed to select an optimal combination of acoustic features that affect the emotion conveyed in speech. We then used a recurrent neural network model to classify emotions from speech. The results show that the proposed system achieves higher recognition accuracy than previous studies.
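To make the feature-extraction step above concrete, the following is a minimal sketch of two of the acoustic features the abstract mentions: F0 (here estimated by simple autocorrelation) and a spectral feature (spectral centroid). This is an illustrative simplification, not the paper's actual pipeline; the sample rate, frame length, and F0 search range are assumptions, and real systems typically use robust pitch trackers and MFCC libraries.

```python
import numpy as np

SR = 16000  # assumed sample rate in Hz

def estimate_f0(frame, sr=SR, fmin=60.0, fmax=500.0):
    """Rough F0 estimate via the autocorrelation peak within a plausible pitch range."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo = int(sr / fmax)  # shortest lag (highest pitch) to consider
    hi = int(sr / fmin)  # longest lag (lowest pitch) to consider
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag

def spectral_centroid(frame, sr=SR):
    """Spectral centroid: magnitude-weighted mean frequency of the frame's spectrum."""
    mag = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    return float((freqs * mag).sum() / (mag.sum() + 1e-12))

# Synthetic 220 Hz tone standing in for one short speech frame (50 ms)
t = np.arange(0, 0.05, 1.0 / SR)
frame = np.sin(2 * np.pi * 220.0 * t)
f0 = estimate_f0(frame)        # should land near 220 Hz
sc = spectral_centroid(frame)  # near 220 Hz for a pure tone
```

In a full system, per-frame features like these (plus MFCCs and harmonic features) would be stacked into a time-ordered sequence and fed to the recurrent network, which is what lets the model exploit momentary changes in emotional expression.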

Citation (APA)

Byun, S. W., & Lee, S. P. (2021). A study on a speech emotion recognition system with effective acoustic features using deep learning algorithms. Applied Sciences (Switzerland), 11(4), 1–15. https://doi.org/10.3390/app11041890
