Beyond Key-Frames: The Physical Setting as a Video Mining Primitive

  • Aner-Wolf A
  • Kender J

Abstract

We present an automatic tool for the compact representation, cross-referencing, and exploration of long video sequences, based on a novel visual abstraction of semantic content. Our approach builds a highly compact hierarchical representation for long sequences. This is achieved by non-temporal clustering of scene segments into a new conceptual form grounded in the recognition of real-world backgrounds. We represent shots and scenes using mosaics derived from representative shots, and employ a novel method for comparing scenes based on these representative mosaics. We then cluster scenes together into a more useful higher level of abstraction: the physical setting. We demonstrate our work using situation comedies, where each half-hour (40,000-frame) episode is well-structured by rules governing background use. Consequently, browsing, indexing, and comparison across videos by physical setting is very fast. Further, we show that analyzing how frequently these physical settings are used leads directly to high-level contextual identification of the main plots in each video. We demonstrate these contributions with a browsing tool that allows both temporal and non-temporal browsing of episodes from situation comedies.
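
The abstract does not specify the mosaic comparison measure or the clustering algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical precomputed matrix `dist` of pairwise scene distances (standing in for the mosaic-based comparison), groups scenes non-temporally with a simple single-link threshold rule (an assumption chosen for brevity), and then counts how often each resulting "physical setting" is used. The names `cluster_scenes`, `setting_frequency`, and `threshold` are illustrative only.

```python
from collections import Counter


def cluster_scenes(dist, threshold):
    """Group scenes whose pairwise distance is below `threshold`.

    `dist` is a hypothetical symmetric matrix (list of lists) of scene-to-scene
    distances. Single-link grouping via union-find is an assumption made for
    this sketch; the abstract does not name a clustering algorithm.
    Returns one setting label per scene, numbered by first appearance.
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge any two scenes that are close enough, regardless of temporal order.
    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] < threshold:
                parent[find(j)] = find(i)

    # Relabel cluster roots as 0, 1, 2, ... in order of first appearance.
    label_of = {}
    labels = []
    for i in range(n):
        root = find(i)
        label_of.setdefault(root, len(label_of))
        labels.append(label_of[root])
    return labels


def setting_frequency(labels):
    """Count how many scenes fall in each setting, most frequent first."""
    return Counter(labels).most_common()


# Toy data: five scenes, where scenes 0, 2, 4 share one background
# and scenes 1, 3 share another.
example_dist = [
    [0.00, 0.90, 0.10, 0.80, 0.20],
    [0.90, 0.00, 0.85, 0.15, 0.90],
    [0.10, 0.85, 0.00, 0.90, 0.10],
    [0.80, 0.15, 0.90, 0.00, 0.95],
    [0.20, 0.90, 0.10, 0.95, 0.00],
]
example_labels = cluster_scenes(example_dist, threshold=0.3)
example_counts = setting_frequency(example_labels)
```

In this toy example the scenes are grouped by background similarity rather than by their order in the episode, and the per-setting counts give the kind of usage frequency that the abstract says can be analyzed to identify main plots.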

Cite

Aner-Wolf, A., & Kender, J. R. (2003). Beyond Key-Frames: The Physical Setting as a Video Mining Primitive. In Video Mining (pp. 31–60). Springer US. https://doi.org/10.1007/978-1-4757-6928-9_2
