Audio-to-visual conversion via HMM inversion for speech-driven facial animation

13Citations
Citations of this article
4Readers
Mendeley users who have this article in their library.
Get full text

Abstract

In this paper, the inversion of a joint Audio-Visual Hidden Markov Model is proposed to estimate the visual information from speech data in a speech driven MPEG-4 compliant facial animation system. The inversion algorithm is derived for the general case of considering full covariance matrices for the audio-visual observations. The system performance is evaluated for the cases of full and diagonal covariance matrices. Experimental results show that full covariance matrices are preferable since similar, to the case of using diagonal matrices, performance can be achieved using a less complex model. The experiments are carried out using audio-visual databases compiled by the authors. © 2008 Springer Berlin Heidelberg.

Cite

CITATION STYLE

APA

Terissi, L. D., & Gómez, J. C. (2008). Audio-to-visual conversion via HMM inversion for speech-driven facial animation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5249 LNAI, pp. 33–42). Springer Verlag. https://doi.org/10.1007/978-3-540-88190-2_9

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free