Abstract
In this paper, we address three problems in discrete speech emotion recognition: single-granularity feature extraction, loss of temporal information, and inefficient use of frame-level features. First, a preliminary cognitive mechanism of auditory emotion is explored through cognitive experiments; a multi-granularity fusion feature extraction method for discrete emotional speech signals, inspired by this mechanism, is then proposed. The method extracts features at three granularities: short-term dynamic features at frame granularity, dynamic features at segment granularity, and long-term static features at global granularity. Finally, we use an LSTM network to classify emotions according to the long- and short-term characteristics of the fused features. We conducted experiments on the CHEAVD (CASIA Chinese Emotional Audio-Visual Database) discrete emotion dataset, released by the Institute of Automation, Chinese Academy of Sciences, and achieved an improvement in recognition rate, increasing MAP by 6.48%.
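The abstract describes fusing features of three granularities (frame, segment, global) into one representation before sequence modeling. The sketch below illustrates one plausible way to do this: upsampling the coarser features to frame rate and concatenating. All names, shapes, and the segment length are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fuse_multigranularity(frame_feats, segment_feats, global_feats, seg_len):
    """Fuse frame-, segment-, and global-granularity features into one
    frame-rate sequence by upsampling coarser features and concatenating.
    (Hypothetical sketch; shapes and segment layout are assumptions.)

    frame_feats   : (T, Df)  short-term dynamic features, one row per frame
    segment_feats : (S, Ds)  dynamic features, one row per seg_len-frame segment
    global_feats  : (Dg,)    long-term static features for the whole utterance
    """
    T = frame_feats.shape[0]
    # Repeat each segment's feature vector over its frames, trimmed to T frames
    seg_up = np.repeat(segment_feats, seg_len, axis=0)[:T]
    # Broadcast the single global vector across every frame
    glob_up = np.tile(global_feats, (T, 1))
    # Concatenate along the feature axis: (T, Df + Ds + Dg)
    return np.concatenate([frame_feats, seg_up, glob_up], axis=1)

# Example: a 100-frame utterance split into four 25-frame segments
frames = np.random.randn(100, 39)   # e.g. MFCCs plus deltas per frame
segments = np.random.randn(4, 16)   # one dynamic vector per segment
utterance = np.random.randn(32)     # global statistics (e.g. means, ranges)
fused = fuse_multigranularity(frames, segments, utterance, seg_len=25)
print(fused.shape)  # (100, 87)
```

The resulting frame-rate sequence of fused vectors is the kind of input an LSTM classifier, as mentioned in the abstract, would consume one time step at a time.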
Xu, C., Li, H., Bo, H., & Ma, L. (2019). Speech emotion recognition using multi-granularity feature fusion through auditory cognitive mechanism. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11518 LNCS, pp. 117–131). Springer Verlag. https://doi.org/10.1007/978-3-030-23407-2_10