Degraded signal quality and incomplete voice probes have severe effects on the performance of a speaker recognition system. Unified audio characteristics (UACs) have been proposed to quantify multi-condition signal degradation effects into posterior probabilities of quality classes. In previous work, we showed that UAC-based quality vectors (q-vectors) are efficient at the score-normalization stage. Hence, we motivate q-vector-based calibration by using functions of quality estimates (FQEs). In this work, we examine the robustness of calibration approaches to low-SNR and short-duration conditions utilizing measured and estimated quality indicators. Thereby, comparisons are drawn to quality measure functions (QMFs) employing oracle SNRs and sample duration. In the robustness study, low-SNR and short-duration conditions are excluded from calibration training. The present analysis provides insights into the behavior of calibration schemes in combined conditions of high signal degradation and short segment duration with regard to accurate approximation of idealized calibration. We seek calibration methods that parsimoniously preserve robustness against unseen data. Separate analyses are provided for duration-only and noise-only scenarios as well as for combined duration and noise scenarios. QMFs and FQEs reduce Cmc costs to 5–6% of those of conventional calibration schemes if all conditions are known, and to 10–12% in the presence of unseen conditions.
Nautsch, A., Saeidi, R., Rathgeb, C., & Busch, C. (2016). Robustness of quality-based score calibration of speaker recognition systems with respect to low-SNR and short-duration conditions. In Odyssey 2016: Speaker and Language Recognition Workshop (pp. 358–365). International Speech Communication Association. https://doi.org/10.21437/Odyssey.2016-52