Speech-driven facial animation using a shared Gaussian process latent variable model

Abstract

In this work, facial animation is synthesised by modelling the mapping between facial motion and speech using the shared Gaussian process latent variable model (SGPLVM). The two data streams are processed separately and subsequently coupled to yield a shared latent space. This allows coarticulation to be modelled by placing a dynamical model on the latent space. Novel animation is synthesised by first obtaining intermediate latent points from the audio data and then using a Gaussian process mapping to predict the corresponding visual data. Statistical evaluation of the generated visual features against ground-truth data compares favourably with known methods of speech animation, and the generated videos show proper synchronisation with the audio and exhibit correct facial dynamics. © 2009 Springer-Verlag.
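The synthesis pipeline the abstract describes (audio features to shared latent points, then a GP mapping from latent points to visual features) can be illustrated with a short sketch. The Python/NumPy code below is a minimal, hypothetical illustration and not the authors' implementation: it assumes the shared latent points have already been learned, uses a plain squared-exponential kernel with hand-picked hyperparameters, and stands in proper latent inference from audio with a nearest-neighbour lookup, whereas the paper optimises latent points and hyperparameters under the SGPLVM likelihood with a dynamical prior to capture coarticulation.

```python
import numpy as np


def rbf_kernel(Xa, Xb, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between two sets of latent points."""
    d2 = (np.sum(Xa**2, axis=1)[:, None]
          + np.sum(Xb**2, axis=1)[None, :]
          - 2.0 * Xa @ Xb.T)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)


def gp_predict(X_train, Y_train, X_test, noise=1e-3, **kern):
    """Posterior mean of a GP regression from latent points to observed features."""
    K = rbf_kernel(X_train, X_train, **kern) + noise * np.eye(len(X_train))
    Ks = rbf_kernel(X_test, X_train, **kern)
    return Ks @ np.linalg.solve(K, Y_train)


def synthesise_visual(audio_new, audio_train, latent_train, visual_train):
    """For each new audio frame, take the latent point of its nearest training
    audio frame (a crude stand-in for proper latent inference in the SGPLVM),
    then map the latent trajectory to visual features with a GP."""
    d2 = np.sum((audio_new[:, None, :] - audio_train[None, :, :])**2, axis=2)
    latent_new = latent_train[np.argmin(d2, axis=1)]
    return gp_predict(latent_train, visual_train, latent_new)


# Toy usage with random arrays standing in for real speech/visual features.
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 13))   # audio features (e.g. MFCC-like), one row per frame
V = rng.normal(size=(200, 30))   # visual features, one row per frame
X = rng.normal(size=(200, 3))    # shared latent points (assumed already learned)
print(synthesise_visual(rng.normal(size=(5, 13)), A, X, V).shape)  # (5, 30)
```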

Citation (APA)

Deena, S., & Galata, A. (2009). Speech-driven facial animation using a shared Gaussian process latent variable model. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5875 LNCS, pp. 89–100). https://doi.org/10.1007/978-3-642-10331-5_9
