Detection of documentary scene changes by audio-visual fusion

Abstract

The concept of a documentary scene was inferred from the audio-visual characteristics of certain documentary videos. The visual component alone did not carry enough information to convey a semantic context for most portions of these videos, whereas observing the visual and audio components jointly conveyed a better one. Based on our observations of the video data, we generated an audio score and a visual score. We then computed a weighted audio-visual score within an interval and adaptively expanded or shrank the interval until the score reached a local maximum. The video is thereby divided into a set of intervals that correspond to its documentary scenes. After obtaining the set of documentary scenes, we checked for redundant detections. © Springer-Verlag Berlin Heidelberg 2003.
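
The interval search described in the abstract can be illustrated with a minimal sketch. The abstract does not define the audio score, the visual score, or the fusion weights, so everything below is an assumption made for illustration: each modality is reduced to one number per frame, the hypothetical `modality_score` rewards long intervals and penalizes spread within them, `w_audio` and `lam` are made-up parameters, and the search is a plain hill climb on the interval's end point. The redundancy check is not sketched because the abstract gives no detail on it.

```python
from typing import Callable, List, Sequence, Tuple


def modality_score(x: Sequence[float], start: int, end: int, lam: float) -> float:
    """Hypothetical per-modality score for frames [start, end).

    Grows with interval length and drops when the frames disagree
    (sum of squared deviations from the interval mean).
    """
    seg = x[start:end]
    mean = sum(seg) / len(seg)
    spread = sum((v - mean) ** 2 for v in seg)
    return len(seg) - lam * spread


def find_scene_end(score: Callable[[int, int], float], start: int,
                   n_frames: int, init_len: int = 4, step: int = 1) -> int:
    """Expand or shrink the interval end until the score is a local maximum."""
    end = min(start + init_len, n_frames)
    best = score(start, end)
    while True:
        candidates = []
        if end < n_frames:                      # try expanding
            candidates.append(min(end + step, n_frames))
        if end - step > start:                  # try shrinking
            candidates.append(end - step)
        if not candidates:
            return end
        top_score, top_end = max((score(start, c), c) for c in candidates)
        if top_score <= best:                   # no neighbor improves: local maximum
            return end
        best, end = top_score, top_end


def segment(audio: Sequence[float], visual: Sequence[float],
            w_audio: float = 0.5, lam: float = 4.0) -> List[Tuple[int, int]]:
    """Split the frame range into consecutive intervals (candidate scenes)."""
    n = len(audio)

    def fused(start: int, end: int) -> float:
        # Weighted audio-visual score over the interval (assumed linear fusion).
        return (w_audio * modality_score(audio, start, end, lam)
                + (1.0 - w_audio) * modality_score(visual, start, end, lam))

    scenes = []
    start = 0
    while start < n:
        end = find_scene_end(fused, start, n)
        scenes.append((start, end))
        start = end
    return scenes


if __name__ == "__main__":
    # Toy data: both modalities change character at frame 10.
    audio = [0.0] * 10 + [1.0] * 10
    visual = [0.0] * 10 + [1.0] * 10
    print(segment(audio, visual))
```

On the toy data the fused score keeps rising while the interval stays inside one homogeneous stretch and falls as soon as it crosses frame 10, so the search stops there. Real audio and visual scores would need features and weights chosen for the material at hand.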

Velivelli, A., Ngo, C. W., & Huang, T. S. (2003). Detection of documentary scene changes by audio-visual fusion. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2728, 227–237. https://doi.org/10.1007/3-540-45113-7_23