In this paper we present a simple yet effective approach to extend any still-image object proposal method to video without supervision. Unlike previous methods, these spatio-temporal proposals, which we refer to as “tracks”, are generated with little or no reliance on visual content, exploiting only the spatial correlation of bounding boxes over time. The resulting tracks are likely to represent objects and serve as a general-purpose tool for representing meaningful video content across a wide variety of tasks. For unannotated videos, tracks can be used to discover content without any supervision. As a further contribution, we propose a novel, dataset-independent method to evaluate a generic object proposal based on the entropy of a classifier's output response. We experiment on two competitive datasets, namely YouTube Objects [6] and ILSVRC-2015 VID [7].
CITATION STYLE
Cuffaro, G., Becattini, F., Baecchi, C., Seidenari, L., & Del Bimbo, A. (2016). Segmentation free object discovery in video. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9915 LNCS, pp. 25–31). Springer Verlag. https://doi.org/10.1007/978-3-319-49409-8_4