A high-resolution 3D dynamic faci...
978-1-4244-2154-1/08/$25.00 ©2008 IEEE
Abstract
Face information processing relies on the quality of the data resource. From the data modality point of view, a face database can be 2D or 3D, and static or dynamic. From the task point of view, the data can be used for research on computer-based automatic face recognition, facial expression recognition, face detection, or cognitive and psychological investigation. With the advancement of 3D imaging technologies, 3D dynamic facial sequences (called 4D data) have been used for face information analysis. In this paper, we focus on the modality of 3D dynamic data for the task of facial expression recognition. We present a newly created high-resolution 3D dynamic facial expression database, which is made available to the scientific research community. The database contains 606 3D facial expression sequences captured from 101 subjects of various ethnic backgrounds. The database has been validated through our facial expression recognition experiment using an HMM-based 3D spatio-temporal facial descriptor. We expect the database to help move facial expression analysis from a static 3D space to a dynamic 3D space, with the goal of scrutinizing facial behavior at a higher level of detail in a real 3D spatio-temporal domain.
1. Introduction
Research on facial information analysis has intensified recently, driven mainly by its important applications: face recognition (FR) [10, 13, and 43] and facial expression recognition (FER) [5, 7, 18, 20, 21, and 42]. Most research on face analysis has used conventional 2D static images or 2D dynamic videos [1, 4, 13, 25, 30, 31, 39, and 43]. In recent years, 3D range data has been extensively used for face analysis due to its explicit representation of geometric information and its inherent capability of handling facial pose and illumination variations [2, 12, and 22].
Similar to the 2D modality, 3D face data can also be represented in a static space and a dynamic space.
(1) 3D static: Most existing work utilizes static 3D range data acquired through laser scanning, stereo photogrammetry, or active light projection [2, 3, 8, 12, 22, 38, and 40]. Much successful work has been reported recently for 3D face recognition [2, 12, 14, 22, and 37]. Most recently, some work has also been reported for 3D facial expression recognition [35, 36, and 41]. For example, Yin et al. investigated 3D facial expression recognition [35] using 3D surface primitive features based on the 3D static facial expression database [40]. Wang et al. [36] used in-house 3D static face models to study facial expressions in 3D space and 2D space, and achieved encouraging results in identifying facial expression abnormality in schizophrenia. Note that all the data used in these studies are still based on static range models.
(2) 3D dynamic: Facial expression is by nature a dynamic facial behavior. The 3D dynamic face representation is believed to be the best reflection of this nature. Psychological research shows that facial dynamics provide important cues that can be interpreted in order to represent an individual's characteristics [27]. Recent findings indicate that the dynamic cues from expressive and talking movements of human faces provide information about individuals' facial structure, and therefore play a great role in facilitating subsequent recognition [27]. A human face is a bumpy and mobile surface. Neither 2D dynamic data nor 3D static data may be sufficient to depict such a property. The lack of a temporal context in 3D static face models may be a profound handicap to recognizing facial expressions as well as identifying faces with varied expressions. Recent technological advances in 3D imaging systems allow a high-quality 3D shape to be acquired in real time [3, 8, and 38].
Such 3D dynamic data (so-called 4D data) captures the dynamics of time-varying 3D facial surfaces, making it possible to analyze dynamic facial behavior in a 3D spatio-temporal domain. It is conceivable that more information concerning an individual's characteristics or expressive traits can be derived from 3D dynamic sequences. A few works have used 4D data for facial expression analysis. For example, Wang et al. [38] successfully developed a hierarchical framework for tracking high-density 3D facial expression sequences captured from a structured-light imaging system. Recent work reported by Chang and Turk et al. [3] utilized 3D model sequences for expression analysis and editing. That work is accomplished by using a probabilistic model on the generalized expression manifold of the standard model. Notice that the existing reported works
A High-Resolution 3D Dynamic Facial Expression Database
Lijun Yin, Xiaochen Chen, Yi Sun, Tony Worm, and Michael Reale
Department of Computer Science
State University of New York at Binghamton