We present a new system-level framework for automatic head detection and tracking of multiple people in intelligent meeting rooms. We implement this approach with a distributed array of cameras that detect the meeting participants and continuously estimate each participant's head orientation and head movement in six degrees of freedom with fine precision. The initial position of each person is obtained with a set of face detectors coupled with a new iterative approach that resolves the 3D ambiguities arising from overlapping epipolar lines. The head pose is obtained from a hybrid head pose estimation and tracking scheme that combines support vector regressors with a new multi-view 3D model-based tracking system. The purpose of this system is to facilitate the automatic semantic analysis of group meetings. As an example application, we evaluate the ability of the system to identify the person who receives the most visual attention, as indicated by the head directions of the other participants.
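To illustrate the example application, the following is a minimal sketch of how the most-attended person could be estimated once 3D head positions and head directions are available. It is not the method evaluated in this work: the voting rule, the 30-degree angular threshold, and the names `angle_between`, `attention_target`, and `most_attended` are all assumptions introduced here for illustration.

```python
import math
from collections import Counter


def angle_between(u, v):
    """Return the angle in radians between two 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Degenerate input: treat as "not looking at" the other person.
        return math.pi
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos)


def attention_target(observer, positions, directions, max_angle_deg=30.0):
    """Pick the person whose head lies closest to the observer's head direction.

    Returns None if nobody falls within max_angle_deg (an assumed threshold).
    """
    best, best_angle = None, math.radians(max_angle_deg)
    origin = positions[observer]
    for other, pos in positions.items():
        if other == observer:
            continue
        to_other = tuple(p - q for p, q in zip(pos, origin))
        ang = angle_between(directions[observer], to_other)
        if ang <= best_angle:
            best, best_angle = other, ang
    return best


def most_attended(positions, directions, max_angle_deg=30.0):
    """Return the person who receives the most head-direction votes, or None."""
    votes = Counter()
    for observer in positions:
        target = attention_target(observer, positions, directions, max_angle_deg)
        if target is not None:
            votes[target] += 1
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical single-frame input: head positions in metres and head directions.
positions = {"A": (0.0, 0.0, 0.0), "B": (2.0, 0.0, 0.0), "C": (1.0, 2.0, 0.0)}
directions = {"A": (1.0, 2.0, 0.0), "B": (-1.0, 2.0, 0.0), "C": (-1.0, -2.0, 0.0)}
focus = most_attended(positions, directions)
```

In this hypothetical frame, A and B both face C while C faces A, so `focus` is `"C"`. A practical system would likely accumulate such votes over many frames rather than decide from a single one.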