Joint learning of speech-driven facial motion with bidirectional long-short term memory

Abstract

The face conveys a blend of verbal and nonverbal information that plays an important role in daily interaction. While speech articulation mostly affects the orofacial areas, emotional behaviors are externalized across the entire face. Considering the relation between verbal and nonverbal behaviors is important for creating naturalistic facial movements for conversational agents (CAs). Furthermore, facial muscles connect areas across the face, creating principled relationships and dependencies between movements that have to be taken into account. These relationships are ignored when facial movements across the face are generated separately. This paper proposes speech-driven models that jointly capture the relationship not only between speech and facial movements, but also across facial movements. The inputs to the models are features extracted from speech that convey the verbal and emotional states of the speakers. We build our models with bidirectional long short-term memory (BLSTM) units, which have been shown to be very successful in modeling dependencies in sequential data. Objective and subjective evaluations of the results demonstrate the benefits of jointly modeling facial regions with this framework.
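To illustrate the kind of architecture the abstract describes, the sketch below shows a bidirectional LSTM that maps per-frame speech features to facial-movement parameters for all facial regions through a single shared output layer, so that dependencies across regions are modeled jointly. This is a minimal illustration in PyTorch, not the authors' implementation: the feature dimensions (e.g., 39-dimensional speech features, 30-dimensional facial parameters), layer sizes, and training objective are assumptions for the example.

import torch
import torch.nn as nn

class SpeechDrivenFaceBLSTM(nn.Module):
    """Minimal sketch: a BLSTM that maps a sequence of speech features to
    facial-motion parameters for all facial regions at once, so relationships
    across regions are learned jointly rather than per region."""

    def __init__(self, speech_dim=39, hidden_dim=128, face_dim=30, num_layers=2):
        super().__init__()
        self.blstm = nn.LSTM(
            input_size=speech_dim,
            hidden_size=hidden_dim,
            num_layers=num_layers,
            batch_first=True,
            bidirectional=True,
        )
        # One shared output layer predicts every facial region together.
        self.out = nn.Linear(2 * hidden_dim, face_dim)

    def forward(self, speech_feats):
        # speech_feats: (batch, time, speech_dim)
        h, _ = self.blstm(speech_feats)   # (batch, time, 2 * hidden_dim)
        return self.out(h)                # (batch, time, face_dim)


if __name__ == "__main__":
    model = SpeechDrivenFaceBLSTM()
    dummy_speech = torch.randn(4, 100, 39)  # 4 utterances, 100 frames each
    face_motion = model(dummy_speech)
    print(face_motion.shape)                # torch.Size([4, 100, 30])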

Citation (APA)

Sadoughi, N., & Busso, C. (2017). Joint learning of speech-driven facial motion with bidirectional long-short term memory. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10498 LNAI, pp. 389–402). Springer Verlag. https://doi.org/10.1007/978-3-319-67401-8_49
