The GMM-SVM supervector approach for the recognition of the emotional status from speech

Abstract

Emotion recognition from speech is an important field of research in human-machine interfaces and has various applications, for instance in call centers. In the proposed classifier system, RASTA-PLP (perceptual linear prediction) features are extracted from the speech signals. The first step is to compute a universal background model (UBM) that represents the general structure of the underlying feature space of speech signals. This UBM is modeled as a Gaussian mixture model (GMM). Once the UBM has been computed, the sequence of feature vectors extracted from an utterance is used to re-train the UBM. The mean vectors of the resulting GMM are extracted and concatenated into so-called GMM supervectors, which are then passed to a support vector machine classifier. The overall system has been evaluated on utterances from the public Berlin emotional database. Using the proposed features, an utterance-based recognition rate of 79% has been achieved, which is close to human performance on this database. © 2009 Springer Berlin Heidelberg.
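
The supervector step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the UBM is re-trained on each utterance; the code below assumes MAP adaptation of the component means only, with a relevance factor, which is a common choice in GMM supervector systems but may differ from the authors' procedure. The function names, the `relevance` parameter, and the toy UBM and frame values are all hypothetical, and the feature extraction (RASTA-PLP), UBM training, and SVM classification stages are not shown.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def responsibilities(x, weights, means, variances):
    """Posterior probability of each UBM component given one frame x."""
    logs = [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # subtract the max for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]

def adapt_means(frames, weights, means, variances, relevance=16.0):
    """Assumed re-training step: MAP-adapt only the UBM means to one utterance."""
    k, d = len(means), len(means[0])
    counts = [0.0] * k                    # soft frame counts per component
    sums = [[0.0] * d for _ in range(k)]  # responsibility-weighted frame sums
    for x in frames:
        for c, g in enumerate(responsibilities(x, weights, means, variances)):
            counts[c] += g
            for j in range(d):
                sums[c][j] += g * x[j]
    adapted = []
    for c in range(k):
        alpha = counts[c] / (counts[c] + relevance)  # trust in the utterance data
        # alpha * (sums / counts) is written as sums / (counts + relevance)
        # so that components with zero count fall back to the UBM mean.
        adapted.append([
            sums[c][j] / (counts[c] + relevance) + (1.0 - alpha) * means[c][j]
            for j in range(d)
        ])
    return adapted

def supervector(frames, weights, means, variances, relevance=16.0):
    """Concatenate the adapted component means into one fixed-length vector."""
    adapted = adapt_means(frames, weights, means, variances, relevance)
    return [value for mean in adapted for value in mean]

# Toy UBM with 2 components in 2 dimensions (illustrative values only).
ubm_weights = [0.5, 0.5]
ubm_means = [[0.0, 0.0], [4.0, 4.0]]
ubm_vars = [[1.0, 1.0], [1.0, 1.0]]

# Four hypothetical feature frames standing in for one utterance.
utterance = [[0.2, -0.1], [0.5, 0.3], [3.8, 4.4], [4.1, 3.9]]
sv = supervector(utterance, ubm_weights, ubm_means, ubm_vars)
```

Every utterance maps to a vector of the same length (number of components times feature dimension), regardless of how many frames it contains. That fixed length is what allows utterances of different durations to be compared by a support vector machine.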

Cite

Schwenker, F., Scherer, S., Magdi, Y. M., & Palm, G. (2009). The GMM-SVM supervector approach for the recognition of the emotional status from speech. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5768 LNCS, pp. 894–903). https://doi.org/10.1007/978-3-642-04274-4_92
