Speaker-emotion variability is one of the major factors that degrade the performance of speaker recognition systems. The difficulty arises mainly from a shift in the acoustic space, so an emotional model cannot be generated from neutral utterances alone. This paper presents a translated learning method that uses both the neutral and the emotional speech in the development data as translators to build "bridges" between the neutral model space and the emotional model space. With the help of these translators, a GMM emotional model can be derived from its neutral counterpart. Experiments carried out on MASC show an IR increase of 2.81% over the GMM-UBM system. © Springer International Publishing 2013.
CITATION STYLE
Chen, L., & Yang, Y. (2013). Emotional speaker recognition based on model space migration through translated learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8232 LNCS, pp. 394–401). https://doi.org/10.1007/978-3-319-02961-0_49