This work addresses the task of multi-person tracking in crowded street scenes, where long-term occlusions pose a major challenge. One popular way to address this challenge is to re-identify people before and after occlusions using Convolutional Neural Networks (CNNs). To achieve good performance, CNNs require a large amount of training data, which is not available for multi-person tracking scenarios. Instead of annotating large training sequences, we introduce a customized multi-person tracker that automatically adapts its person re-identification CNNs to capture the discriminative appearance patterns in a test sequence. We show that a few high-quality training examples that are automatically mined from the test sequence can be used to fine-tune pre-trained CNNs, thereby teaching them to recognize the uniqueness of people’s appearance in the test sequence. To that end, we introduce a hierarchical correlation clustering (HCC) framework, in which we utilize an existing robust correlation clustering tracking model, but with different graph structures to generate local, reliable tracklets as well as globally associated tracks. We deploy intuitive physical constraints on the local tracklets to generate the high-quality training examples for customizing the person re-identification CNNs. Our customized multi-person tracker achieves state-of-the-art performance on the challenging MOT16 tracking benchmark.
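As an illustration of how training examples might be mined from local tracklets, the following minimal sketch assumes two physical constraints that the abstract does not spell out: detections within one reliable tracklet show the same person (positive pairs), and two tracklets that appear in the same frame must be different people (negative pairs). The `Detection` type, its fields, and the `mine_pairs` function are hypothetical names introduced only for this example; they are not taken from the paper.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Detection:
    """Hypothetical record for one detection assigned to a local tracklet."""
    frame: int        # frame index in the test sequence
    tracklet_id: int  # id of the local tracklet this detection belongs to
    box_id: int       # index of the detection (stand-in for an image crop)


def mine_pairs(detections):
    """Mine (positive, negative) detection pairs from tracklets.

    Assumed constraints (not confirmed by the source):
      * same tracklet              -> same person      (positive pair)
      * tracklets sharing a frame  -> different people (negative pair)
    """
    # Group detections by tracklet.
    by_tracklet = {}
    for det in detections:
        by_tracklet.setdefault(det.tracklet_id, []).append(det)

    # Positive pairs: every pair of detections inside one tracklet.
    positives = []
    for dets in by_tracklet.values():
        positives.extend(combinations(dets, 2))

    # Negative pairs: cross pairs of tracklets that co-occur in some frame,
    # since one person cannot be in two places at once.
    negatives = []
    for a, b in combinations(sorted(by_tracklet), 2):
        frames_a = {d.frame for d in by_tracklet[a]}
        frames_b = {d.frame for d in by_tracklet[b]}
        if frames_a & frames_b:
            negatives.extend(
                (da, db) for da in by_tracklet[a] for db in by_tracklet[b]
            )
    return positives, negatives


# Toy input: tracklets 0 and 1 overlap in frame 2; tracklet 2 is isolated.
example = [
    Detection(frame=1, tracklet_id=0, box_id=0),
    Detection(frame=2, tracklet_id=0, box_id=1),
    Detection(frame=2, tracklet_id=1, box_id=2),
    Detection(frame=3, tracklet_id=1, box_id=3),
    Detection(frame=10, tracklet_id=2, box_id=4),
    Detection(frame=11, tracklet_id=2, box_id=5),
]
positives, negatives = mine_pairs(example)
```

Pairs mined this way could then feed a pairwise or triplet loss when fine-tuning a pre-trained re-identification network; the actual constraints and loss used by the authors may differ.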
CITATION STYLE
Ma, L., Tang, S., Black, M. J., & Van Gool, L. (2019). Customized Multi-person Tracker. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11362 LNCS, pp. 612–628). Springer Verlag. https://doi.org/10.1007/978-3-030-20890-5_39