Probabalistic models and informative subspaces for audiovisual correspondence

9Citations
Citations of this article
13Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

We propose a probabalistic model of single source multimodal generation and show how algorithms for maximizing mutual information can find the correspondences between components of each signal. We show how non-parametric techniques for finding informative subspaces can capture the complex statistical relationship between signals in different modalities. We extend a previous technique for finding informative subspaces to include new priors on the projection weights, yielding more robust results. Applied to human speakers, our model can find the relationship between audio speech and video of facial motion, and partially segment out background events in both channels. We present new results on the problem of audio-visual verification, and show how the audio and video of a speaker can be matched even when no prior model of the speaker’s voice or appearance is available.

Cite

CITATION STYLE

APA

Fisher, J. W., & Darrell, T. (2002). Probabalistic models and informative subspaces for audiovisual correspondence. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2352, pp. 592–603). Springer Verlag. https://doi.org/10.1007/3-540-47977-5_39

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free