Recently, Multi-Target Multi-Camera Tracking (MTMCT) has gained more and more attention. It is a challenging task with major problems including occlusion, background clutter, poses and camera point of view variations. Compared to single camera tracking, which takes advantage of location information and strict time constraints, good appearance features are more important to MTMCT. This drives us to extract robust and discriminative features for MTMCT. We propose MTMCT_HS which uses human body part semantic features to overcome the above challenges. We use a two-stream deep neural network to extract the global appearance features and human body part semantic maps separately, and employ aggregation operations to generate final features. We argue that these features are more suitable for affinity measurement, which can be seen as the average of appearance similarity weighted by the corresponding human body part similarity. Next, our tracker adopts a hierarchical correlation clustering algorithm, which combines targets' appearance feature similarity with motion correlation for data association. We validate the effectiveness of our MTMCT_HS method by demonstrating its superiority over the state-of-the-art method on DukeMTMC benchmark. Experiments show that the extracted features with human body part semantics are more effective for MTMCT compared with the methods solely employing global appearance features.
CITATION STYLE
Wang, M., Yi, W., Shi, D., Zhang, T., Guan, N., & Fan, Z. (2019). Multi-target multi-camera tracking with human body part semantic features. In International Conference on Information and Knowledge Management, Proceedings (pp. 199–208). Association for Computing Machinery. https://doi.org/10.1145/3357384.3358029
Mendeley helps you to discover research relevant for your work.