NIST RT'05S evaluation: Pre-processing techniques and speaker diarization on multiple microphone meetings

Abstract

This paper presents several pre-processing techniques coupled with three speaker diarization systems, evaluated in the framework of the NIST 2005 Spring Rich Transcription campaign (RT'05S). The pre-processing techniques compute a signal quality index that is used to build a single "virtual" signal from all the microphone recordings available for a meeting. This virtual signal is a weighted sum of the individual microphone signals, with the signal quality index based on a signal-to-noise ratio. Two methods are used to compute the instantaneous signal-to-noise ratio: a speech activity detection based approach and a noise spectrum estimate. The speaker diarization task is performed using systems developed by three labs: LIA, LIUM and CLIPS. Among the system submissions made by these three labs, the best system obtained a speaker diarization error of 24.5% for the conference subdomain and 18.4% for the lecture subdomain. © Springer-Verlag Berlin Heidelberg 2006.
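To make the idea of an SNR-weighted "virtual" signal concrete, the following is a minimal sketch and not the authors' implementation. It assumes a crude energy-based split between noise and speech frames as a stand-in for speech activity detection, and it uses one weight per channel for the whole recording, whereas the abstract describes an instantaneous signal-to-noise ratio. All function names, the frame length, and the "quietest quarter of frames is noise" rule are hypothetical choices made for illustration.

```python
from typing import List, Sequence

EPS = 1e-12  # keeps the SNR finite when the estimated noise energy is zero


def frame_energies(signal: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in signal[i:i + frame_len]) / frame_len
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def estimate_snr(signal: Sequence[float], frame_len: int = 4) -> float:
    """Crude linear SNR: quietest quarter of frames is treated as noise.

    This energy split is an assumption standing in for a real speech
    activity detector or noise spectrum estimator.
    """
    energies = sorted(frame_energies(signal, frame_len))
    if not energies:
        raise ValueError("signal is shorter than one frame")
    n_noise = max(1, len(energies) // 4)
    noise = sum(energies[:n_noise]) / n_noise
    speech_frames = energies[n_noise:] or energies
    speech = sum(speech_frames) / len(speech_frames)
    return speech / (noise + EPS)


def snr_weights(channels: Sequence[Sequence[float]], frame_len: int = 4) -> List[float]:
    """Normalise per-channel SNR estimates so the weights sum to 1."""
    snrs = [estimate_snr(ch, frame_len) for ch in channels]
    total = sum(snrs)
    if total == 0.0:
        # All channels silent: fall back to a plain average.
        return [1.0 / len(channels)] * len(channels)
    return [s / total for s in snrs]


def virtual_signal(channels: Sequence[Sequence[float]], frame_len: int = 4) -> List[float]:
    """Weighted sum of the microphone signals, sample by sample."""
    weights = snr_weights(channels, frame_len)
    n = min(len(ch) for ch in channels)
    return [sum(w * ch[t] for w, ch in zip(weights, channels)) for t in range(n)]
```

With two recordings of the same meeting, a microphone whose quiet frames are close to silent receives a much larger weight than one whose quiet frames are noisy, so the virtual signal is dominated by the cleaner channel. A per-frame variant would recompute the weights for each frame instead of once per recording.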

Istrate, D., Fredouille, C., Meignier, S., Besacier, L., & Bonastre, J. F. (2006). NIST RT’05S evaluation: Pre-processing techniques and speaker diarization on multiple microphone meetings. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3869 LNCS, pp. 428–439). https://doi.org/10.1007/11677482_36
