We introduce a novel approach to modeling the dynamics of human facial motion induced by speech for the purpose of synthesis. We represent the trajectories of a number of salient features on the human face as the output of a dynamical system made up of two subsystems: one driven by the deterministic speech input, and a second driven by an unknown stochastic input. Inference of the model (learning) is performed automatically and involves an extension of independent component analysis to time-dependent data. Using a shape-texture decompositional representation of the face, we generate facial image sequences reconstructed from synthesized feature point positions. © Springer-Verlag 2004.
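The two-subsystem structure described above can be sketched as a linear dynamical system whose state is driven jointly by a deterministic input and a stochastic one. The sketch below is illustrative only, not the authors' actual model: all matrices, dimensions, and the sinusoidal "speech-like" input are arbitrary stand-ins chosen for the demonstration.

```python
import numpy as np

# Illustrative sketch (assumed form, not the paper's model):
# feature-point trajectories y_t generated by
#     x_{t+1} = A x_t + B u_t + K e_t,    y_t = C x_t
# where u_t is a deterministic (speech-like) input and
# e_t is an unknown stochastic input.

rng = np.random.default_rng(0)

n, m, p, T = 4, 1, 2, 100          # state, input, output dims; horizon
A = 0.9 * np.eye(n)                # stable state-transition matrix (toy choice)
B = rng.standard_normal((n, m))    # deterministic-input gain
K = 0.1 * rng.standard_normal((n, m))  # stochastic-input gain
C = rng.standard_normal((p, n))    # maps state to feature coordinates

u = np.sin(0.2 * np.arange(T)).reshape(T, m)  # toy deterministic input
e = rng.standard_normal((T, m))               # unknown stochastic input

x = np.zeros(n)
Y = np.empty((T, p))
for t in range(T):
    Y[t] = C @ x                   # observed feature positions at frame t
    x = A @ x + B @ u[t] + K @ e[t]

print(Y.shape)                     # trajectory of p coordinates over T frames
```

In the paper, learning such a model from data is where the extension of independent component analysis to time-dependent data comes in; the simulation above only shows the generative direction.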
CITATION STYLE
Saisan, P., Bissacco, A., Chiuso, A., & Soatto, S. (2004). Modeling and synthesis of facial motion driven by speech. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3023, 456–467. https://doi.org/10.1007/978-3-540-24672-5_36