Audio-Visual Tracking for Natural Interactivity

Citations: N/A
Readers: 5 (Mendeley users who have this article in their library)

Abstract

The goal in user interfaces is natural interactivity unencumbered by sensor and display technology. In this paper, we propose that a multi-modal approach using inverse modeling techniques from computer vision, speech recognition, and acoustics can result in such interfaces. In particular, we demonstrate a system for audiovisual tracking, showing that such a system is more robust, more accurate, more compact, and yields more information than using a single modality for tracking. We also demonstrate how such a system can be used to find the talker among a group of individuals, and render 3D scenes to the user.
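The abstract argues that fusing audio and visual estimates yields a more robust and accurate tracker than either modality alone. The paper does not spell out its fusion rule here, but a common approach to combining two noisy position estimates is inverse-variance weighting; the sketch below (function names and the per-axis variance model are illustrative assumptions, not the authors' method) shows how a visual estimate and an acoustic estimate of a talker's 3D position might be merged:

```python
import numpy as np

def fuse_estimates(visual_pos, visual_var, audio_pos, audio_var):
    """Fuse two independent 3D position estimates by inverse-variance
    weighting, per axis. Returns the fused position and its variance.

    A low-variance (confident) modality dominates the result; the fused
    variance is always at most the smaller input variance, which is one
    way a bimodal tracker can beat either single modality.
    """
    visual_pos = np.asarray(visual_pos, dtype=float)
    audio_pos = np.asarray(audio_pos, dtype=float)
    w_v = 1.0 / np.asarray(visual_var, dtype=float)   # visual weight
    w_a = 1.0 / np.asarray(audio_var, dtype=float)    # audio weight
    fused_pos = (w_v * visual_pos + w_a * audio_pos) / (w_v + w_a)
    fused_var = 1.0 / (w_v + w_a)
    return fused_pos, fused_var

# Example: the camera is confident in x/y but poor in depth (z),
# while the microphone array is better at range estimation.
pos, var = fuse_estimates(
    visual_pos=[1.0, 2.0, 3.0], visual_var=[0.01, 0.01, 1.0],
    audio_pos=[1.2, 2.1, 2.5],  audio_var=[0.5, 0.5, 0.05],
)
```

With these inputs the fused x and y stay close to the visual estimate while z moves toward the acoustic one, matching the abstract's claim that combining modalities yields more information than either alone.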

Citation (APA)

Pingali, G., Tunali, G., & Carlbom, I. (1999). Audio-Visual Tracking for Natural Interactivity. In MULTIMEDIA 1999 - Proceedings of the 7th ACM International Conference on Multimedia (Part 1) (Vol. 1, pp. 373–382). Association for Computing Machinery, Inc. https://doi.org/10.1145/319463.319652
