Learning efficient deep representations from spectrograms for speech emotion recognition remains a significant challenge. Most existing deep-learning-based spectrogram feature extraction methods have achieved considerable success, but they ignore the distinct patterns of change along the time and frequency axes of the spectrogram. In this paper, we propose a speech emotion recognition method integrating self-attention that accounts for both the interaction between time and frequency and their respective changing information. We first propose a time-frequency convolutional neural network (TFCNN) to learn deep representations from the spectrogram. We then introduce a multi-head self-attention layer, inspired by Google's Transformer, to fuse these deep representations more efficiently. Finally, extreme learning machine (ELM) and bidirectional long short-term memory (BLSTM) models are adopted as emotion classifiers. Experiments on the IEMOCAP dataset demonstrate the effectiveness of the proposed method, yielding better visualizations and classification results.
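The fusion step described above can be sketched as a standard multi-head self-attention pass over a sequence of frame-level deep features. This is a minimal numpy illustration of the mechanism, not the paper's implementation: the sequence length, feature dimension, head count, and random weight initialization below are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads):
    """Multi-head self-attention over a sequence X of shape
    (seq_len, d_model), e.g. TFCNN features per spectrogram frame."""
    seq_len, d_model = X.shape
    d_head = d_model // n_heads
    Q, K, V = X @ Wq, X @ Wk, X @ Wv            # each (seq_len, d_model)

    def split(M):
        # Reshape to (n_heads, seq_len, d_head) so heads attend independently.
        return M.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)

    Qh, Kh, Vh = split(Q), split(K), split(V)
    scores = Qh @ Kh.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, L, L)
    heads = softmax(scores, axis=-1) @ Vh                  # (heads, L, d_head)
    # Concatenate heads back to (seq_len, d_model), then project.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ Wo

# Hypothetical sizes: 10 frames, 16-dim features, 4 attention heads.
rng = np.random.default_rng(0)
L, d, h = 10, 16, 4
X = rng.standard_normal((L, d))
Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) * 0.1 for _ in range(4))
out = multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads=h)
print(out.shape)  # (10, 16)
```

The attended output keeps the input's sequence shape, so it can be passed directly to a downstream classifier such as the BLSTM or, after pooling, the ELM mentioned in the abstract.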
Liu, J., Liu, Z., Wang, L., Guo, L., & Dang, J. (2019). Time-frequency deep representation learning for speech emotion recognition integrating self-attention. In Communications in Computer and Information Science (Vol. 1142 CCIS, pp. 681–689). Springer. https://doi.org/10.1007/978-3-030-36808-1_74