Multimodal speaker identification based on text and speech


Abstract

This paper proposes a novel method for speaker identification based on both speech utterances and their transcribed text. The transcribed text of each speaker's utterance is processed by probabilistic latent semantic indexing (PLSI), which offers a powerful means of modeling each speaker's vocabulary through a number of hidden topics that are closely related to his/her identity, function, or expertise. Mel-frequency cepstral coefficients (MFCCs) are extracted from each speech frame, and their dynamic range is quantized into a number of predefined bins in order to compute local MFCC histograms for each speech utterance, which is time-aligned with the transcribed text. Two identity scores are computed independently: one by PLSI applied to the text, and one by a nearest-neighbor classifier applied to the local MFCC histograms. A convex combination of the two scores is demonstrated to be more accurate than either individual score in speaker identification experiments conducted on broadcast news from the RT-03 MDE Training Data Text and Annotations corpus distributed by the Linguistic Data Consortium. © 2008 Springer Berlin Heidelberg.
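The speech-side pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the bin edges, histogram distance, and fusion weight `alpha` are assumptions, and the MFCC extraction step itself is taken as given (the input is already an array of per-frame coefficients).

```python
import numpy as np

def mfcc_histograms(mfccs, n_bins=20, lo=-50.0, hi=50.0):
    """Quantize each MFCC dimension into predefined bins and build a
    normalized per-utterance histogram. `mfccs` is (frames, coeffs);
    the dynamic range [lo, hi] and bin count are illustrative choices."""
    edges = np.linspace(lo, hi, n_bins + 1)
    # One local histogram per cepstral coefficient, concatenated.
    hists = [np.histogram(mfccs[:, d], bins=edges)[0]
             for d in range(mfccs.shape[1])]
    h = np.concatenate(hists).astype(float)
    return h / max(h.sum(), 1.0)

def nn_speech_score(utt_hist, speaker_hists):
    """Nearest-neighbor score per enrolled speaker: negative Euclidean
    distance to that speaker's reference histogram (higher is better)."""
    return {spk: -float(np.linalg.norm(utt_hist - ref))
            for spk, ref in speaker_hists.items()}

def fuse_scores(text_score, speech_score, alpha=0.5):
    """Convex combination of the text (PLSI) and speech (histogram)
    identity scores; alpha in [0, 1] is a free fusion weight."""
    return alpha * text_score + (1.0 - alpha) * speech_score
```

The convex fusion is the key step the paper evaluates: with `alpha` tuned on held-out data, `fuse_scores` can outperform either modality alone.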

Citation (APA)

Moschonas, P., & Kotropoulos, C. (2008). Multimodal speaker identification based on text and speech. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5372 LNCS, pp. 100–109). https://doi.org/10.1007/978-3-540-89991-4_11
