Combining audio and image processing for understanding video content has several benefits compared to using each modality on its own. For the task of context and activity recognition in video sequences, it is important to exploit both data streams to gather relevant information. In this paper we describe a video context and activity recognition model. Our work extracts a range of audio and visual features, followed by feature reduction and information fusion. We show that combining audio-based with video-based decision making improves the quality of context and activity recognition in videos by 4% over audio data alone and 18% over image data alone. © Springer-Verlag Berlin Heidelberg 2006.
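The abstract describes fusing audio-based and video-based decisions. As an illustration only, a common way to do this is decision-level (late) fusion, where per-class posterior probabilities from each modality's classifier are combined by a weighted average; the weighting scheme and function names below are illustrative assumptions, not the authors' exact method.

```python
import numpy as np

def fuse_decisions(audio_probs, video_probs, audio_weight=0.5):
    """Hypothetical late-fusion sketch: weighted average of the per-class
    posterior probabilities from an audio and a video classifier, followed
    by an argmax over the fused scores."""
    audio_probs = np.asarray(audio_probs, dtype=float)
    video_probs = np.asarray(video_probs, dtype=float)
    fused = audio_weight * audio_probs + (1.0 - audio_weight) * video_probs
    return int(np.argmax(fused)), fused

# Example: three activity classes; audio favours class 1, video favours class 2.
label, fused = fuse_decisions([0.2, 0.5, 0.3], [0.1, 0.3, 0.6], audio_weight=0.4)
```

With these weights the video evidence dominates, so the fused decision picks class 2; in practice the weight would be tuned on validation data.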
CITATION STYLE
Lopes, J., & Singh, S. (2006). Audio and video feature fusion for activity recognition in unconstrained videos. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4224 LNCS, pp. 823–831). Springer Verlag. https://doi.org/10.1007/11875581_99