Cross-Media Body-Part Attention Network for Image-to-Video Person Re-Identification


Abstract

Person re-identification (re-id) is an important application in public security and has attracted growing research interest because of its practical value. Most person re-id models address image-based or video-based re-id. Image-to-video person re-id, however, is also important for locating missing persons, tracking criminals, and retrieving pedestrian video. The key challenge in image-to-video person re-id is building an accurate connection between appearance features from images and spatio-temporal features from videos, given the large cross-media gap between the two modalities. Although existing image-to-video person re-id models perform well, they remain far from practical application. These methods consider only the similarity measurement of cross-media features, which are extracted from the whole original image/video without weighting regions by importance. However, the most useful and discriminative information is contained in human body parts (torso, elbow, wrist, knee, and ankle), while pedestrian image/video backgrounds contain a large amount of useless information. In this paper, we present a Cross-media Body-part Attention Network (CBAN) for image-to-video person re-id, which extracts cross-media body-part attention features from images/videos (by CNN/LSTM) and simultaneously ignores useless background information by using a part attention mechanism. In addition, our network alleviates the inherent cross-media gap through a novel media-pulling constraint term. Extensive experiments are conducted on three large-scale datasets (Market1501, Mars and CUHK03) and two small datasets (PRID-2011, iLIDS-VID), and the results show that our CBAN approach can solve the image-to-video person re-id problem effectively with a body-part attention mechanism.

Cite

Yu, B., Xu, N., & Zhou, J. (2019). Cross-Media Body-Part Attention Network for Image-to-Video Person Re-Identification. IEEE Access, 7, 94966–94976. https://doi.org/10.1109/ACCESS.2019.2928337
