Video-based group re-identification (Re-ID) is a meaningful but rarely studied task. Group Re-ID captures the relationships between pedestrians, while video sequences provide more frames with which to identify each person. In this paper, we propose a spatial-temporal fusion network for group Re-ID. The network combines residual learning between the CNN and the RNN in a unified architecture with an attention mechanism that makes the system focus on discriminative features. We also propose a new group Re-ID dataset, DukeGroupVid, to evaluate the performance of our spatial-temporal fusion network. Comprehensive experimental results on the proposed dataset and on other video-based datasets (PRID-2011, i-LIDS-VID and MARS) demonstrate the effectiveness of our model.
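The general idea of fusing per-frame (spatial) features with recurrent (temporal) features through a residual connection, then pooling frames with attention, can be illustrated with a toy sketch. This is not the authors' architecture: the abstract gives no layer details, so the exponential moving average standing in for the RNN, the dot-product attention score, and the names `recurrent_pass`, `fuse`, `query` and `decay` are all assumptions made for illustration. Frame features are taken as given, as if already produced by a CNN, and the sequence is assumed to be non-empty.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def recurrent_pass(frames, decay=0.5):
    """Toy stand-in for an RNN (assumption): a running average over time."""
    state = [0.0] * len(frames[0])
    states = []
    for frame in frames:
        state = [decay * s + (1.0 - decay) * v for s, v in zip(state, frame)]
        states.append(state)
    return states


def fuse(frames, query, decay=0.5):
    """Residual fusion of spatial and temporal features, then attention pooling.

    frames: per-frame feature vectors (assumed to come from a CNN).
    query:  hypothetical attention vector; a trained model would learn it.
    """
    temporal = recurrent_pass(frames, decay)
    # Residual connection: add the temporal feature onto the spatial one.
    fused = [[x + h for x, h in zip(f, t)] for f, t in zip(frames, temporal)]
    # Attention: score each fused frame, normalise, and take the weighted sum.
    scores = [sum(q * v for q, v in zip(query, f)) for f in fused]
    weights = softmax(scores)
    pooled = [
        sum(w * f[d] for w, f in zip(weights, fused))
        for d in range(len(query))
    ]
    return pooled, weights


frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = fuse(frames, query=[1.0, 1.0])
```

Here `pooled` is a single sequence-level descriptor and `weights` shows how much each frame contributed. In a trained system, the recurrent unit and the attention scoring would both be learned rather than fixed as they are here.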
CITATION
Xu, Q., Yang, H., & Chen, L. (2019). Spatial-temporal fusion network with residual learning and attention mechanism: A benchmark for video-based group Re-ID. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11857 LNCS, pp. 492–504). Springer. https://doi.org/10.1007/978-3-030-31654-9_42