Diarization of the language consulting center telephone calls


Abstract

In this paper, we describe the diarization of archive data from the project “Access to a Linguistically Structured Database of Enquiries from the Language Consulting Center”. The project aims to improve access to the large Czech-language archives, consisting mainly of telephone conversations, that the Language Consulting Center collects continuously. One part of this archive contains mono recordings, in which the speech of the client and of the language counsellor is mixed in a single channel. Our proposed approach to diarization uses the identity of the language counsellor, obtained from the text transcription of the beginning of the conversation. For the initial stage of the diarization, we adopted our system based on clustering x-vectors. A resegmentation step then refines the boundaries of the speaker changes using a pre-trained Gaussian mixture model of the counsellor. Because our data are unique, we compared our results with the Kaldi diarization as the baseline system.
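The first stage of the pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the clustering algorithm, so a simple two-centroid cosine clustering is swapped in here, the 2-D vectors are toy stand-ins for real x-vectors, and `counsellor_reference` is a hypothetical embedding standing in for the knowledge of the counsellor's identity. The resegmentation with the counsellor's Gaussian mixture model is not shown.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cluster_two_speakers(embeddings, iterations=10):
    """Split segment embeddings into two clusters by cosine similarity.

    Assumption: exactly two speakers (client and counsellor) per call.
    """
    # Seed with the first embedding and the one least similar to it.
    first = embeddings[0]
    farthest = min(embeddings, key=lambda e: cosine(first, e))
    centroids = [first, farthest]
    labels = [0] * len(embeddings)
    for _ in range(iterations):
        # Assign each segment to the more similar centroid.
        labels = [
            0 if cosine(e, centroids[0]) >= cosine(e, centroids[1]) else 1
            for e in embeddings
        ]
        # Recompute each centroid from its current members.
        for k in (0, 1):
            members = [e for e, lab in zip(embeddings, labels) if lab == k]
            if members:
                centroids[k] = mean(members)
    return labels, centroids


def name_clusters(labels, centroids, counsellor_reference):
    """Label the cluster closest to the counsellor reference as 'counsellor'."""
    sim0 = cosine(centroids[0], counsellor_reference)
    sim1 = cosine(centroids[1], counsellor_reference)
    counsellor_cluster = 0 if sim0 >= sim1 else 1
    return [
        "counsellor" if lab == counsellor_cluster else "client"
        for lab in labels
    ]


# Toy per-segment embeddings (hypothetical values, one per speech segment).
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9], [1.0, 0.0]]
counsellor_reference = [0.0, 1.0]  # hypothetical counsellor embedding

labels, centroids = cluster_two_speakers(embeddings)
speakers = name_clusters(labels, centroids, counsellor_reference)
print(speakers)
```

In a full system, the segment labels produced this way would only be a starting point; the resegmentation step would then rescore short frames against the counsellor model to move the speaker-change boundaries.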

Cite

Zajíc, Z., Psutka, J. V., Zajícová, L., Müller, L., & Salajka, P. (2019). Diarization of the language consulting center telephone calls. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11658 LNAI, pp. 549–558). Springer Verlag. https://doi.org/10.1007/978-3-030-26061-3_56
