Audio and video feature fusion for activity recognition in unconstrained videos

Abstract

Combining audio and image processing for understanding video content has several benefits compared to using each modality on its own. For the task of context and activity recognition in video sequences, it is important to explore both data streams to gather relevant information. In this paper we describe a video context and activity recognition model. Our work extracts a range of audio and visual features, followed by feature reduction and information fusion. We show that combining audio-based with video-based decision making improves the quality of context and activity recognition in videos by 4% over audio data alone and 18% over image data alone. © Springer-Verlag Berlin Heidelberg 2006.
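The abstract's "combining audio with video based decision making" points to decision-level (late) fusion. A minimal sketch of that idea, assuming weighted averaging of per-class classifier scores — the weights, class labels, and function name below are illustrative assumptions, not details from the paper:

```python
# Hypothetical late-fusion sketch: each modality's classifier produces
# per-class scores, which are combined by a weighted average before the
# final decision. Weights and labels here are invented for illustration.

def fuse_decisions(audio_scores, video_scores, audio_weight=0.5):
    """Weighted average of per-class scores from the audio and video
    classifiers; returns the label with the highest fused score."""
    video_weight = 1.0 - audio_weight
    fused = {
        label: audio_weight * audio_scores[label]
               + video_weight * video_scores[label]
        for label in audio_scores
    }
    return max(fused, key=fused.get)

# Example: the modalities disagree; fusion weighs both opinions.
audio = {"crowd": 0.7, "traffic": 0.2, "indoor": 0.1}
video = {"crowd": 0.3, "traffic": 0.5, "indoor": 0.2}
print(fuse_decisions(audio, video, audio_weight=0.6))  # → crowd
```

A fixed weight is the simplest choice; the paper's reported gains over either single modality suggest the fused decision outperforms whichever classifier is weaker on a given clip.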

Citation (APA)

Lopes, J., & Singh, S. (2006). Audio and video feature fusion for activity recognition in unconstrained videos. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4224 LNCS, pp. 823–831). Springer Verlag. https://doi.org/10.1007/11875581_99
