This paper describes a multimodal approach to speaker verification. The system consists of two classifiers, one using visual features and the other using acoustic features. A lip tracker extracts visual information from the speaking face, providing both shape and intensity features. We describe an approach for normalizing and mapping the scores of different modalities onto a common confidence interval, along with a novel method for integrating the scores of multiple classifiers. Verification experiments are reported for the individual modalities and for the combined classifier. The integrated system outperformed each sub-system, reducing the false acceptance rate of the acoustic sub-system from 2.3% to 0.5%.
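The abstract does not state the actual normalization or integration formulas. As a generic illustration of the idea only, the following minimal sketch maps each classifier's raw score onto a common [0, 1] interval and fuses them by a weighted sum; the min-max mapping, the weighted-sum rule, the weight, and all function names are assumptions, not the paper's method:

```python
# Hypothetical sketch of two-classifier (acoustic + visual) score
# normalization and fusion for speaker verification. The min-max
# normalization and weighted-sum fusion below are illustrative
# assumptions; the paper's actual method is not given in the abstract.

def normalize(score, lo, hi):
    """Map a raw classifier score onto the common interval [0, 1]."""
    return (score - lo) / (hi - lo)

def fuse(acoustic, visual, w_acoustic=0.7):
    """Combine two normalized confidence scores by a weighted sum."""
    return w_acoustic * acoustic + (1.0 - w_acoustic) * visual

def verify(acoustic_raw, visual_raw, acoustic_range, visual_range,
           threshold=0.5):
    """Accept the claimed identity if the fused confidence is high enough."""
    a = normalize(acoustic_raw, *acoustic_range)
    v = normalize(visual_raw, *visual_range)
    return fuse(a, v) >= threshold
```

In such a scheme, raising the threshold trades a lower false acceptance rate against a higher false rejection rate, which is the kind of operating-point trade-off the reported error rates reflect.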
Jourlin, P., Luettin, J., Genoud, D., & Wassner, H. (1997). Acoustic-labial speaker verification. In Lecture Notes in Computer Science (Vol. 1206, pp. 319–326). Springer. https://doi.org/10.1007/bfb0016011