Efficient visual content retrieval and mining in videos

Abstract

We describe an image representation for objects and scenes consisting of a configuration of viewpoint covariant regions and their descriptors. This representation enables recognition to proceed successfully despite changes in scale, viewpoint, illumination and partial occlusion. Vector quantization of these descriptors then enables efficient matching on the scale of an entire feature film. We show two applications. The first is efficient object retrieval, where the technology of text retrieval, such as inverted file systems, can be employed at run time to return all shots containing the object in a manner, and with a speed, similar to a Google search for text. The object is specified by a user outlining it in an image, and is then delineated in the retrieved shots. The second application is data mining: we obtain the principal objects, characters and scenes in a video by measuring the reoccurrence of these spatial configurations of viewpoint covariant regions. Both applications are illustrated on two full-length feature films. © Springer-Verlag Berlin Heidelberg 2004.
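The retrieval pipeline the abstract describes (quantize local descriptors into "visual words", index shots with an inverted file, answer queries text-search style) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the toy cluster centers, and the simple word-overlap ranking (the paper's full system uses tf-idf weighting and spatial verification) are all assumptions for the sketch.

```python
import numpy as np
from collections import defaultdict

def quantize(descriptors, centers):
    """Vector quantization: map each local region descriptor to the
    index of its nearest cluster center, i.e. its 'visual word'."""
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1)

def build_inverted_index(shot_words):
    """shot_words: {shot_id: iterable of visual-word ids seen in that shot}.
    Returns the inverted file {word_id: set of shot_ids}, so lookup by
    word is O(1), as in text retrieval."""
    index = defaultdict(set)
    for shot_id, words in shot_words.items():
        for w in set(words):
            index[w].add(shot_id)
    return index

def retrieve(query_words, index):
    """Rank shots by how many distinct query words they contain
    (a stand-in for the paper's tf-idf scoring); ties broken by id."""
    scores = defaultdict(int)
    for w in set(query_words):
        for shot_id in index.get(w, ()):
            scores[shot_id] += 1
    return sorted(scores, key=lambda s: (-scores[s], s))

# Toy usage: two 2-D 'descriptors' quantized against two centers,
# then a two-shot index queried with the resulting words.
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
words = quantize(np.array([[1.0, 1.0], [9.0, 9.0]]), centers)  # -> [0, 1]
index = build_inverted_index({"shot_A": [0, 1], "shot_B": [1]})
print(retrieve(list(words), index))  # shot_A matches both words
```

Only shots sharing at least one visual word with the query are ever touched, which is what makes matching feasible at feature-film scale.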

Citation (APA)

Sivic, J., & Zisserman, A. (2004). Efficient visual content retrieval and mining in videos. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3332, 471–478. https://doi.org/10.1007/978-3-540-30542-2_58
