Multimodal People ID for a Multimedia Meeting Browser

Abstract

A meeting browser is a system that allows users to review a multimedia meeting record using a variety of indexing methods. Identifying meeting participants is essential for creating such a record. Moreover, knowing who is speaking can improve speech recognition performance and the indexing of meeting transcripts. In this paper, we present an approach that identifies meeting participants by fusing multimodal inputs. We use face ID, speaker ID, color appearance ID, and sound source directional ID to identify and track meeting participants. After describing the different modules in detail, we discuss a framework for combining the information sources. Integration of the multimodal people ID into the multimedia meeting browser is at a preliminary stage.
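
The abstract does not say how the four information sources are combined, so the following is only an illustrative sketch of one common way to fuse per-modality identification scores: a weighted product of scores computed in the log domain. It is not the framework described in the paper. The function name `fuse_scores`, the modality keys, the participant names, and all score values are hypothetical.

```python
import math


def fuse_scores(modality_scores, weights=None, floor=1e-6):
    """Fuse per-modality scores into one normalized score per person.

    modality_scores: dict of modality name -> dict of person -> score in (0, 1].
    weights: optional dict of modality name -> non-negative weight (default 1.0).
    floor: assumed minimum score, used when a modality has no score for a person.
    """
    weights = weights or {}
    people = set()
    for scores in modality_scores.values():
        people.update(scores)
    if not people:
        return {}

    # Weighted sum of log scores, i.e. a weighted product of the raw scores.
    log_fused = {}
    for person in people:
        total = 0.0
        for modality, scores in modality_scores.items():
            w = weights.get(modality, 1.0)
            total += w * math.log(max(scores.get(person, floor), floor))
        log_fused[person] = total

    # Normalize so the fused scores sum to 1 (subtract the max for stability).
    peak = max(log_fused.values())
    unnormalized = {p: math.exp(v - peak) for p, v in log_fused.items()}
    norm = sum(unnormalized.values())
    return {p: v / norm for p, v in unnormalized.items()}


# Hypothetical scores for two participants from the four modalities
# named in the abstract.
example_scores = {
    "face": {"alice": 0.7, "bob": 0.3},
    "speaker": {"alice": 0.4, "bob": 0.6},
    "color": {"alice": 0.8, "bob": 0.2},
    "direction": {"alice": 0.5, "bob": 0.5},
}
fused = fuse_scores(example_scores)
best = max(fused, key=fused.get)
```

With equal weights this treats the modalities as independent evidence, which is an assumption made here for simplicity; a weight below 1.0 reduces the influence of a less reliable modality.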

Cite

Yang, J., Zhu, X., Gross, R., Kominek, J., Pan, Y., & Waibel, A. (1999). Multimodal People ID for a Multimedia Meeting Browser. In MULTIMEDIA 1999 - Proceedings of the 7th ACM International Conference on Multimedia (Part 1) (Vol. 1, pp. 159–168). Association for Computing Machinery, Inc. https://doi.org/10.1145/319463.319484
