3D Mouth Tracking from a Compact Microphone Array Co-Located with a Camera

Abstract

We address 3D audio-visual mouth tracking on a compact platform with co-located audio and video sensors and no depth camera. In particular, we propose a multi-modal particle filter that combines a face detector with a mapping of 3D hypotheses onto the image plane. The audio likelihood, which relies on a GCC-PHAT based acoustic map, is computed with assistance from the video. By combining audio and video inputs, the proposed approach copes with reverberant and noisy environments and handles situations in which the person is occluded, outside the Field of View (FoV), or not facing the sensors. Experimental results show that the proposed tracker is accurate both in 3D and on the image plane.
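To illustrate what mapping a 3D hypothesis onto the image plane can look like, the following is a minimal sketch that assumes an ideal pinhole camera with no lens distortion. The class `PinholeCamera`, the functions `project` and `in_fov`, and all intrinsic values are hypothetical and chosen for illustration; they are not taken from the paper, whose actual camera model and calibration are not given in the abstract.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[float, float]


@dataclass(frozen=True)
class PinholeCamera:
    """Ideal pinhole camera. All values are illustrative assumptions."""
    fx: float      # focal length along x, in pixels
    fy: float      # focal length along y, in pixels
    cx: float      # principal point x, in pixels
    cy: float      # principal point y, in pixels
    width: int     # image width, in pixels
    height: int    # image height, in pixels


def project(cam: PinholeCamera, point: Point3D) -> Optional[Pixel]:
    """Project a 3D point in camera coordinates (metres) to pixel coordinates.

    Returns None when the point lies on or behind the camera plane (z <= 0).
    """
    x, y, z = point
    if z <= 0:
        return None
    u = cam.fx * x / z + cam.cx
    v = cam.fy * y / z + cam.cy
    return (u, v)


def in_fov(cam: PinholeCamera, pixel: Optional[Pixel]) -> bool:
    """Return True when the projected pixel falls inside the image bounds."""
    if pixel is None:
        return False
    u, v = pixel
    return 0 <= u < cam.width and 0 <= v < cam.height


# Hypothetical camera and a few hypothetical 3D mouth hypotheses (particles).
cam = PinholeCamera(fx=500.0, fy=500.0, cx=320.0, cy=240.0, width=640, height=480)
hypotheses = [
    (0.0, 0.0, 2.0),    # straight ahead, 2 m from the camera
    (0.4, -0.2, 2.0),   # slightly to the right and above the optical axis
    (3.0, 0.0, 2.0),    # far to the side, outside the image
    (0.0, 0.0, -1.0),   # behind the camera
]

for h in hypotheses:
    pixel = project(cam, h)
    print(h, "->", pixel, "in FoV:", in_fov(cam, pixel))
```

In a tracker of this kind, hypotheses that project inside the image could be compared against face detections, while those that fall outside the FoV would have to rely on audio alone. That split is an interpretation of the abstract rather than a description of the authors' implementation.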

Cite

Qian, X., Xompero, A., Cavallaro, A., Brutti, A., Lanz, O., & Omologo, M. (2018). 3D Mouth Tracking from a Compact Microphone Array Co-Located with a Camera. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (Vol. 2018-April, pp. 3071–3075). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2018.8461323