Unified system for visual speech recognition and speaker identification


Abstract

This paper proposes a unified system for both visual speech recognition and speaker identification. The proposed system can handle both image and depth data when available. It consists of four consecutive steps, namely, 3D face pose tracking, mouth region extraction, feature computation, and classification using the Support Vector Machine method. The system is experimentally evaluated on three public datasets, namely, MIRACL-VC1, OuluVS, and CUAVE. On the one hand, the visual speech recognition module achieves accuracies of up to 96% and 79.2% in the speaker-dependent and speaker-independent settings, respectively. On the other hand, speaker identification achieves a recognition rate of up to 98.9%. Additionally, the obtained results demonstrate the importance of depth data for resolving the subject dependency issue.
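The final step of the pipeline described above — classifying per-utterance features with a Support Vector Machine — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature vectors here are hypothetical random stand-ins for the descriptors the paper computes from the tracked mouth region, and the kernel and hyperparameters are assumed defaults.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical stand-in features: one fixed-length descriptor per
# utterance (e.g., appearance/motion features pooled over the extracted
# mouth region). Two synthetic "word" classes with separated means.
n_per_class, dim = 40, 64
X_word_a = rng.normal(0.0, 1.0, (n_per_class, dim))
X_word_b = rng.normal(2.0, 1.0, (n_per_class, dim))
X = np.vstack([X_word_a, X_word_b])
y = np.array([0] * n_per_class + [1] * n_per_class)

# Standardize features, then classify with an RBF-kernel SVM.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X, y)
acc = clf.score(X, y)
```

In practice, one SVM would be trained per recognition task (word classes for visual speech recognition, subject identities for speaker identification) on the same mouth-region features.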

Citation (APA)

Rekik, A., Ben-Hamadou, A., & Mahdi, W. (2015). Unified system for visual speech recognition and speaker identification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9386, pp. 381–390). Springer Verlag. https://doi.org/10.1007/978-3-319-25903-1_33
