A review on speech emotion recognition using deep learning and attention mechanism

Abstract

Emotions are an integral part of human interactions and are significant factors in determining user satisfaction or customer opinion. Speech emotion recognition (SER) modules also play an important role in the development of human–computer interaction (HCI) applications. A tremendous number of SER systems have been developed over the last decades. Attention-based deep neural networks (DNNs) have been shown to be suitable tools for mining information that is unevenly distributed over time in multimedia content. The attention mechanism has recently been incorporated into DNN architectures to also emphasise emotionally salient information. This paper provides a review of recent developments in SER and also examines the impact of various attention mechanisms on SER performance. An overall comparison of system accuracies is performed on the widely used IEMOCAP benchmark database.

Cite

Lieskovská, E., Jakubec, M., Jarina, R., & Chmulík, M. (2021, May 2). A review on speech emotion recognition using deep learning and attention mechanism. Electronics (Switzerland). MDPI AG. https://doi.org/10.3390/electronics10101163
