Co-saliency spatio-temporal interaction network for person re-identification in videos

8 Citations · 17 Mendeley Readers

Abstract

Person re-identification aims to identify a given pedestrian across non-overlapping camera networks. Video-based person re-identification approaches have gained significant attention recently, extending image-based approaches by learning features from multiple frames. In this work, we propose a novel Co-Saliency Spatio-Temporal Interaction Network (CSTNet) for person re-identification in videos. It captures the common salient foreground regions among video frames and explores the long-range spatio-temporal context interdependencies of such regions, toward learning a discriminative pedestrian representation. Specifically, multiple co-saliency learning modules within CSTNet are designed to exploit the correlated information across video frames, extracting salient features from task-relevant regions while suppressing background interference. Moreover, multiple spatio-temporal interaction modules within CSTNet exploit the long-range spatial and temporal context interdependencies of these features, together with their spatio-temporal correlation, to enhance the feature representation. Extensive experiments on two benchmarks demonstrate the effectiveness of the proposed method.
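The two module families described in the abstract can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the shared-prototype weighting below is only a stand-in for co-saliency learning (common foreground emphasis across frames), and the self-attention over all frame-location pairs is a generic non-local stand-in for long-range spatio-temporal interaction. All function names and the `(T, N, C)` feature layout (frames, spatial locations, channels) are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_saliency_weights(feats):
    """Illustrative co-saliency weighting (not the paper's exact module).

    feats: (T, N, C) array -- T frames, N spatial locations, C channels.
    Returns per-frame saliency maps (T, N) and reweighted features (T, N, C).
    """
    # Shared prototype: the mean descriptor over all frames and locations,
    # standing in for the "common salient foreground" across frames.
    prototype = feats.mean(axis=(0, 1))                          # (C,)
    # Cosine similarity of every location, in every frame, to the prototype.
    fn = feats / (np.linalg.norm(feats, axis=-1, keepdims=True) + 1e-8)
    pn = prototype / (np.linalg.norm(prototype) + 1e-8)
    sim = fn @ pn                                                # (T, N)
    # Spatial softmax per frame: locations similar across frames get weight,
    # background locations are suppressed.
    sal = softmax(sim, axis=1)
    return sal, feats * sal[..., None]

def spatio_temporal_interaction(feats):
    """Generic non-local (self-attention) sketch of long-range interaction.

    Flattening time and space into one axis lets every location attend to
    every other location in every frame, i.e. long-range spatio-temporal
    context aggregation.
    """
    T, N, C = feats.shape
    x = feats.reshape(T * N, C)
    attn = softmax(x @ x.T / np.sqrt(C), axis=1)                 # (T*N, T*N)
    return (attn @ x).reshape(T, N, C)
```

In a real network both operations would act on learned convolutional feature maps and use learned projections; here raw features are used so the data flow (frames in, saliency-weighted and context-aggregated features out) is visible in isolation.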

Citation (APA)

Liu, J., Zha, Z. J., Zhu, X., & Jiang, N. (2020). Co-saliency spatio-temporal interaction network for person re-identification in videos. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2021-January, pp. 1012–1018). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2020/141
