The paper presents a set of experiments on pathological voice detection on the Saarbrücken Voice Database (SVD), using the MultiFocal toolkit for discriminative calibration and fusion. The SVD is freely available online and contains a collection of voice recordings covering different pathologies, both functional and organic. A generative Gaussian mixture model trained with mel-frequency cepstral coefficients, harmonics-to-noise ratio, normalized noise energy, and glottal-to-noise excitation ratio is used as the classifier. Scores are calibrated to increase performance at the desired operating point. Finally, fusing the different recordings of each speaker, in which the vowels /a/, /i/, and /u/ are pronounced with normal, low, high, and low-high-low intonations, yields a large increase in performance. Results are compared with those obtained on the Massachusetts Eye and Ear Infirmary (MEEI) database, which makes it possible to see that the SVD is much more challenging. © 2012 Springer-Verlag.
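As a rough illustration of what discriminative fusion of per-recording scores can look like, the sketch below trains a linear logistic regression that maps several detector scores to one fused score. It assumes linear logistic regression fusion, a common form of discriminative fusion; the abstract does not describe the toolkit's exact procedure, and this is not the MultiFocal API. The function names (`fuse`, `train_fusion`) and the toy scores are hypothetical.

```python
import math


def fuse(scores, weights, offset):
    """Linear fusion: offset + sum_k weights[k] * scores[k]."""
    return offset + sum(w * s for w, s in zip(weights, scores))


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def train_fusion(score_rows, labels, lr=0.1, epochs=2000):
    """Fit fusion weights and offset by batch gradient descent on the
    logistic (cross-entropy) loss.

    score_rows: one row per speaker, one score per recording type.
    labels: 1 for pathological, 0 for healthy.
    """
    n_sys = len(score_rows[0])
    n = len(score_rows)
    weights = [0.0] * n_sys
    offset = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_sys
        grad_b = 0.0
        for row, y in zip(score_rows, labels):
            # Derivative of the cross-entropy loss w.r.t. the fused score.
            err = sigmoid(fuse(row, weights, offset)) - y
            grad_b += err
            for k, s in enumerate(row):
                grad_w[k] += err * s
        offset -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, offset


# Hypothetical scores for three recordings per speaker (e.g. /a/, /i/, /u/).
toy_scores = [
    [2.0, 1.5, 1.0], [1.0, 2.5, 0.5], [0.5, 1.0, 2.0],        # pathological
    [-1.0, -2.0, -0.5], [-2.0, -0.5, -1.5], [-0.5, -1.0, -2.0],  # healthy
]
toy_labels = [1, 1, 1, 0, 0, 0]

weights, offset = train_fusion(toy_scores, toy_labels)
fused = [fuse(row, weights, offset) for row in toy_scores]
```

With a single input score per speaker, the same training loop reduces to an affine calibration of that score, which is one way to shift the decision threshold toward a desired operating point.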
CITATION STYLE
Martínez, D., Lleida, E., Ortega, A., Miguel, A., & Villalba, J. (2012). Advances in Speech and Language Technologies for Iberian Languages. Communications in Computer and Information Science, 328(November 2014), 99–109. https://doi.org/10.1007/978-3-642-35292-8