Feature learning via deep belief network for Chinese speech emotion recognition


Abstract

Speech emotion recognition is an interesting and challenging subject because of the gap between low-level speech signals and high-level speech emotion. To bridge this gap, this paper presents a method for Chinese speech emotion recognition using deep belief networks (DBNs). A DBN first performs unsupervised feature learning on the extracted low-level acoustic features. A multi-layer perceptron (MLP) is then initialized with the weights learned by the hidden layers of the DBN and used for Chinese speech emotion classification. On the testing data of the Chinese Natural Audio-Visual Emotion Database (CHEAVD), the presented method obtains a classification accuracy of 32.80% and a macro average precision of 41.54% on the speech emotion recognition task, significantly outperforming the baseline results provided by the organizers of the speech emotion recognition sub-challenge.
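The abstract reports two metrics, classification accuracy and macro average precision. The sketch below shows how these are commonly computed, assuming the usual definitions: accuracy is the fraction of correctly labeled utterances, and macro average precision is the unweighted mean of per-class precision. The function names, the toy labels, and the choice to count a never-predicted class as precision 0 are assumptions for illustration and are not taken from the paper.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of utterances whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_average_precision(y_true, y_pred):
    """Unweighted mean of per-class precision over all observed classes.

    Assumption: a class that is never predicted contributes precision 0.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    classes = sorted(set(y_true) | set(y_pred))
    predicted = Counter(y_pred)  # how often each class was predicted
    true_pos = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    per_class = [
        true_pos[c] / predicted[c] if predicted[c] else 0.0 for c in classes
    ]
    return sum(per_class) / len(classes)


# Hypothetical toy labels, not CHEAVD data.
y_true = ["angry", "happy", "happy", "sad", "neutral", "neutral"]
y_pred = ["angry", "happy", "neutral", "neutral", "neutral", "neutral"]

acc = accuracy(y_true, y_pred)
map_score = macro_average_precision(y_true, y_pred)
print(f"accuracy = {acc:.4f}, macro average precision = {map_score:.4f}")
```

Because the macro average weights every emotion class equally, it can differ noticeably from accuracy when the class distribution is imbalanced, which is one reason both figures are reported.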

Cite

Zhang, S., Zhao, X., Chuang, Y., Guo, W., & Chen, Y. (2016). Feature learning via deep belief network for Chinese speech emotion recognition. In Communications in Computer and Information Science (Vol. 663, pp. 645–651). Springer Verlag. https://doi.org/10.1007/978-981-10-3005-5_53
