Understanding and analyzing video content is fundamental to numerous applications, including video summarization, retrieval, navigation, and editing. An important part of this process is detecting salient (usually meaning important and interesting) objects in video segments. Unlike existing approaches, we propose a method that combines saliency measurement with spatial and temporal coherence. The integration of spatial and temporal coherence is inspired by focused attention in human vision. In the proposed method, the spatial coherence of low-level visual grouping cues (e.g., appearance and motion) helps per-frame object-background separation, while the temporal coherence of object properties (e.g., shape and appearance) ensures consistent object localization over time, making the method robust to unexpected environment changes and camera vibrations. We develop an efficient optimization strategy based on coarse-to-fine multi-scale dynamic programming and evaluate our method on a challenging dataset that is freely available together with this paper. We show the effectiveness and complementarity of the two types of coherence and demonstrate that they can significantly improve the performance of salient object detection in videos. © 2011 The Author(s).
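To make the role of temporal coherence concrete, the following is a minimal sketch of how dynamic programming can trade per-frame saliency against consistent localization over time. It is not the authors' formulation: the function name `coherent_path`, the 1-D candidate positions, the absolute-distance jump penalty, and the `smooth_weight` parameter are all assumptions chosen for illustration, and the coarse-to-fine multi-scale search described in the abstract is omitted.

```python
def coherent_path(scores, positions, smooth_weight=1.0):
    """Pick one candidate per frame by dynamic programming (Viterbi-style).

    scores[t][k]  : hypothetical saliency of candidate k in frame t
    positions[k]  : hypothetical 1-D location of candidate k
    smooth_weight : assumed penalty per unit of movement between frames
    Returns the list of chosen candidate indices, one per frame.
    """
    n_cand = len(positions)
    best = list(scores[0])   # best accumulated score ending at each candidate
    back = []                # back-pointers for frames 1..T-1

    for t in range(1, len(scores)):
        new_best, pointers = [], []
        for k in range(n_cand):
            # Best predecessor: accumulated score minus the jump penalty.
            def value(j):
                return best[j] - smooth_weight * abs(positions[k] - positions[j])
            j_star = max(range(n_cand), key=value)
            new_best.append(value(j_star) + scores[t][k])
            pointers.append(j_star)
        best = new_best
        back.append(pointers)

    # Trace back from the best final candidate.
    k = max(range(n_cand), key=lambda j: best[j])
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return path


# Frame 1 contains a slightly more salient distractor far from the object.
scores = [[1.0, 0.0], [0.4, 0.5], [1.0, 0.0]]
positions = [0, 10]
per_frame = coherent_path(scores, positions, smooth_weight=0.0)
coherent = coherent_path(scores, positions, smooth_weight=0.1)
```

With `smooth_weight=0.0` the selection reduces to a per-frame maximum and follows the distractor in the middle frame; with a positive weight the path stays on the same candidate throughout. This mirrors, in a toy setting, the abstract's claim that temporal coherence keeps localization consistent.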
CITATION
Wu, Y., Zheng, N. N., Yuan, Z. J., Jiang, H. Z., & Liu, T. (2011). Detection of salient objects with focused attention based on spatial and temporal coherence. Chinese Science Bulletin, 56(10), 1055–1062. https://doi.org/10.1007/s11434-010-4387-1