Unified system for visual speech recognition and speaker identification


Abstract

This paper proposes a unified system for both visual speech recognition and speaker identification. The proposed system can handle both image and depth data when available. It consists of four consecutive steps, namely, 3D face pose tracking, mouth region extraction, feature computation, and classification using the Support Vector Machine method. The system is experimentally evaluated on three public datasets, namely, MIRACL-VC1, OuluVS, and CUAVE. On the one hand, the visual speech recognition module achieves accuracies of up to 96% and 79.2% in the speaker-dependent and speaker-independent settings, respectively. On the other hand, speaker identification achieves a recognition rate of up to 98.9%. Additionally, the obtained results demonstrate the importance of depth data for resolving the subject dependency issue.
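The final step of the pipeline described above — classifying per-utterance features with a Support Vector Machine — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature vectors here are hypothetical random stand-ins for the descriptors the paper computes from the tracked mouth region, and the kernel and hyperparameters are assumed defaults.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical stand-in features: one fixed-length descriptor per
# utterance (e.g., appearance/motion features pooled over the extracted
# mouth region). Two synthetic "word" classes with separated means.
n_per_class, dim = 40, 64
X_word_a = rng.normal(0.0, 1.0, (n_per_class, dim))
X_word_b = rng.normal(2.0, 1.0, (n_per_class, dim))
X = np.vstack([X_word_a, X_word_b])
y = np.array([0] * n_per_class + [1] * n_per_class)

# Standardize features, then classify with an RBF-kernel SVM.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X, y)
acc = clf.score(X, y)
```

In practice, one SVM would be trained per recognition task (word classes for visual speech recognition, subject identities for speaker identification) on the same mouth-region features.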

Citation (APA)

Rekik, A., Ben-Hamadou, A., & Mahdi, W. (2015). Unified system for visual speech recognition and speaker identification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9386, pp. 381–390). Springer Verlag. https://doi.org/10.1007/978-3-319-25903-1_33
