Neural network speaker descriptor in speaker diarization of telephone speech

Zbyněk Zajíc; Jan Zelinka; Luděk Müller

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2017) 10458 LNAI 555-563

DOI: 10.1007/978-3-319-66429-3_55

Abstract

In this paper, we investigate an approach to speaker representation for a diarization system that clusters short segments of telephone conversations so that segments produced by the same speaker are grouped together. The proposed approach uses a neural-network-based descriptor in place of the i-vector descriptor commonly used in state-of-the-art diarization systems. The two techniques were compared on the English part of the CallHome corpus. The results indicate that the i-vector approach is superior on its own, but that the proposed descriptor contributes additional, complementary information. The combined descriptor therefore represents the speaker in a segment with a lower diarization error (an almost 20% relative improvement over using the i-vector alone).

Cite

Zajíc, Z., Zelinka, J., & Müller, L. (2017). Neural network speaker descriptor in speaker diarization of telephone speech. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10458 LNAI, pp. 555–563). Springer Verlag. https://doi.org/10.1007/978-3-319-66429-3_55
