Scores selection for emotional speaker recognition

2Citations
Citations of this article
3Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Emotion variability of the training and testing utterances is one of the largest challenges in speaker recognition. It is a common situation where training data is the neutral speech and testing data is the mixture of neutral and emotional speech. In this paper, we experimentally analyzed the performance of the GMM-based verification system with the utterances in this situation. It reveals that the verification performance improves as the emotion ratio decreases and the scores of neutral features against his/her model are distributed in the upper area than other three scores(neutral against the model of other speakers, and non-neutral speech against the model of himself/herself and other speakers). Based on these, we propose a scores selection method to reduce the emotion ratio of the testing utterance by eliminating the non-neutral features. It is applicable to the GMM-based recognition system without labeling the emotion state in the testing process. The experiments are carried on the MASC Corpus and the performance of the system with scores selection is improved with an EER reduction from 13.52% to 10.17%. © Springer-Verlag Berlin Heidelberg 2009.

Cite

CITATION STYLE

APA

Shan, Z., & Yang, Y. (2009). Scores selection for emotional speaker recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5558 LNCS, pp. 494–502). https://doi.org/10.1007/978-3-642-01793-3_51

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free