The automatic processing and estimation of view direction and head pose in interactive scenarios is an actively investigated research topic in the development of advanced human-computer and human-robot interfaces. Still, current state-of-the-art approaches often make rigid assumptions about scene illumination and viewing distance in order to achieve stable results. In addition, there is a lack of rigorous evaluation criteria for comparing different computational vision approaches and judging their flexibility. In this work, we take a step towards employing robust computational vision mechanisms to estimate the actor's head pose and thus the direction of their focus of attention. We propose a domain-specific mechanism based on learning to estimate stereo correspondences of image pairs. Furthermore, to facilitate the evaluation of computational vision results, we present a data generation framework capable of image synthesis under controlled pose conditions using an arbitrary camera setup with any number of cameras. We show computational results of our proposed mechanism as well as an evaluation based on the available reference data. © 2011 Springer-Verlag.
Layher, G., Liebau, H., Niese, R., Al-Hamadi, A., Michaelis, B., & Neumann, H. (2011). Robust stereoscopic head pose estimation in human-computer interaction and a unified evaluation framework. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6978 LNCS, pp. 227–236). https://doi.org/10.1007/978-3-642-24085-0_24