Scores selection for emotional speaker recognition

Zhenyu Shan; Yingchun Yang

Conference Proceedings (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2009) 5558 LNCS 494-502

DOI: 10.1007/978-3-642-01793-3_51

Abstract

Emotion variability between training and testing utterances is one of the largest challenges in speaker recognition. A common situation is that the training data is neutral speech while the testing data is a mixture of neutral and emotional speech. In this paper, we experimentally analyze the performance of a GMM-based verification system on utterances in this situation. The analysis reveals that verification performance improves as the emotion ratio decreases, and that the scores of a speaker's neutral features against his/her own model are distributed in a higher range than the other three kinds of scores (neutral speech against the models of other speakers, and non-neutral speech against the speaker's own model and against the models of other speakers). Based on these findings, we propose a scores selection method that reduces the emotion ratio of the testing utterance by eliminating the non-neutral features. It is applicable to GMM-based recognition systems and does not require labeling the emotion state during testing. The experiments are carried out on the MASC Corpus, and scores selection reduces the EER of the system from 13.52% to 10.17%. © Springer-Verlag Berlin Heidelberg 2009.
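
The abstract does not spell out the selection rule, so the following is only a minimal sketch of one possible reading: if the highest frame-level scores tend to come from neutral frames, an utterance can be scored using only the top-scoring fraction of its frames. The function name `utterance_score`, the parameter `keep_ratio`, and the top-fraction rule itself are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def utterance_score(frame_scores, keep_ratio=1.0):
    """Average the highest-scoring fraction of per-frame scores.

    frame_scores: per-frame scores of a test utterance against a claimed
        speaker's model (for example, log-likelihood ratios from a GMM).
    keep_ratio: hypothetical parameter, fraction of frames to keep (0 < r <= 1).
        The paper's actual selection criterion may differ.
    """
    scores = np.asarray(frame_scores, dtype=float)
    if scores.size == 0:
        raise ValueError("frame_scores must not be empty")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")

    # Sort descending so the frames assumed to be neutral come first.
    ranked = np.sort(scores)[::-1]
    n_keep = max(1, int(np.ceil(keep_ratio * ranked.size)))

    # Discard the low-scoring frames and average the rest.
    return float(ranked[:n_keep].mean())


# Toy scores: two high-scoring frames and two low-scoring frames.
toy_scores = [2.0, 1.0, -3.0, -4.0]
score_all = utterance_score(toy_scores, keep_ratio=1.0)
score_selected = utterance_score(toy_scores, keep_ratio=0.5)
```

With the toy values, keeping every frame averages all four scores, while `keep_ratio=0.5` averages only the two highest. Because the rule needs nothing beyond the scores themselves, it is consistent with the abstract's point that no emotion labels are required at test time.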

Cite

Shan, Z., & Yang, Y. (2009). Scores selection for emotional speaker recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5558 LNCS, pp. 494–502). https://doi.org/10.1007/978-3-642-01793-3_51
