Neural network speaker descriptor in speaker diarization of telephone speech


Abstract

In this paper, we investigate an approach to speaker representation for a diarization system that clusters short telephone conversation segments produced by the same speaker. The proposed approach uses a neural-network-based descriptor in place of the i-vector descriptor usual in state-of-the-art diarization systems. The two techniques were compared on the English part of the CallHome corpus. The final results indicate that the i-vector approach is superior, although the proposed descriptor contributes complementary information. The combined descriptor therefore represents a speaker in a segment with a lower diarization error (almost 20% relative improvement over using the i-vector alone).
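The idea of a combined descriptor can be illustrated with a minimal sketch. The abstract does not say how the two descriptors are combined or how segments are compared, so the concatenation of length-normalized vectors and the cosine similarity used here are assumptions made for illustration only. All function names and the toy vectors are hypothetical, and the error values in the last line are invented to show how a relative improvement is computed; they are not results from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combine_descriptors(ivector, nn_descriptor):
    """Concatenate two length-normalized descriptors of one segment.

    Assumption: plain concatenation. The abstract does not specify
    how the i-vector and the neural-network descriptor are combined.
    """
    return l2_normalize(ivector) + l2_normalize(nn_descriptor)


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def relative_improvement(baseline_error, new_error):
    """Relative reduction of an error rate with respect to a baseline."""
    return (baseline_error - new_error) / baseline_error


# Toy segments (hypothetical values): a and b point the same way, c does not.
seg_a = combine_descriptors([1.0, 0.0], [0.0, 2.0])
seg_b = combine_descriptors([2.0, 0.0], [0.0, 1.0])
seg_c = combine_descriptors([0.0, 1.0], [1.0, 0.0])

same_speaker_score = cosine_similarity(seg_a, seg_b)
different_speaker_score = cosine_similarity(seg_a, seg_c)

# Hypothetical error rates, chosen only to demonstrate the arithmetic.
example_gain = relative_improvement(10.0, 8.0)
```

A clustering step would then group segments whose combined descriptors score highly against each other. Normalizing each descriptor before concatenation keeps either one from dominating the similarity purely because of its scale.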

Cite

Zajíc, Z., Zelinka, J., & Müller, L. (2017). Neural network speaker descriptor in speaker diarization of telephone speech. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10458 LNAI, pp. 555–563). Springer Verlag. https://doi.org/10.1007/978-3-319-66429-3_55
