This paper proposes a superpixel-based spatiotemporal saliency model for saliency detection in videos. Based on the superpixel representation of video frames, motion histograms and color histograms are extracted at the superpixel level as local features and at the frame level as global features. Then, superpixel-level temporal saliency is measured by integrating the motion distinctiveness of superpixels with a temporal saliency prediction and adjustment scheme, and superpixel-level spatial saliency is measured by evaluating the global contrast and spatial sparsity of superpixels. Finally, a pixel-level saliency derivation method is used to generate pixel-level temporal and spatial saliency maps, and an adaptive fusion method is exploited to integrate them into the spatiotemporal saliency map. Experimental results on two public datasets demonstrate that the proposed model outperforms six state-of-the-art spatiotemporal saliency models in terms of both saliency detection and human fixation prediction.
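The final stage of the pipeline, adaptively fusing the pixel-level temporal and spatial saliency maps, could be sketched as below. Note that this is a minimal illustrative sketch only: the abstract does not specify the paper's actual fusion rule, so the choice of weighting each map by its own mean saliency is an assumption, not the authors' method.

```python
import numpy as np

def adaptive_fusion(temporal, spatial, eps=1e-8):
    # Hypothetical adaptive weighting: each map contributes in proportion
    # to its own mean saliency, so the more confident cue dominates.
    # (The paper's actual fusion scheme is not given in the abstract.)
    wt = temporal.mean()
    ws = spatial.mean()
    fused = (wt * temporal + ws * spatial) / (wt + ws + eps)
    # Normalize to [0, 1] so the result reads as a saliency map.
    rng = fused.max() - fused.min()
    return (fused - fused.min()) / (rng + eps)

# Toy 4x4 "saliency maps": a motion blob and a color-contrast blob.
t = np.zeros((4, 4)); t[1:3, 1:3] = 1.0
s = np.zeros((4, 4)); s[0:2, 0:2] = 0.5
fused = adaptive_fusion(t, s)
```

In this sketch the region where the two cues overlap receives the highest fused saliency, which matches the intuition that agreement between motion and appearance cues should be rewarded.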
Liu, Z., Zhang, X., Luo, S., & Le Meur, O. (2014). Superpixel-based spatiotemporal saliency detection. IEEE Transactions on Circuits and Systems for Video Technology, 24(9), 1522–1540. https://doi.org/10.1109/TCSVT.2014.2308642