Tracking Objects as Pixel-Wise Distributions

Abstract

Multi-object tracking (MOT) requires detecting objects and associating them across frames. Unlike methods that track via detected bounding boxes or center points, we propose tracking objects as pixel-wise distributions. We instantiate this idea in a transformer-based architecture named P3AFormer, with pixel-wise propagation, prediction, and association. P3AFormer propagates pixel-wise features guided by flow information to pass messages between frames. Further, P3AFormer adopts a meta-architecture to produce multi-scale object feature maps. During inference, a pixel-wise association procedure recovers object connections across frames based on the pixel-wise prediction. P3AFormer yields 81.2% MOTA on the MOT17 benchmark, the highest among all transformer networks to reach 80% MOTA in the literature. P3AFormer also outperforms state-of-the-art methods on the MOT20 and KITTI benchmarks. The code is at https://github.com/dvlab-research/ECCV22-P3AFormer-Tracking-Objects-as-Pixel-wise-Distributions.
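
To make the idea of flow-guided pixel-wise propagation concrete, the following is a minimal sketch, not the P3AFormer implementation. The abstract does not specify how features are propagated, so this assumes a generic backward warp with nearest-neighbor sampling followed by a weighted average; the function names `warp_features` and `fuse`, the `(dy, dx)` flow layout, and the `weight` parameter are all hypothetical and chosen only for illustration.

```python
def warp_features(prev_feat, flow):
    """Backward-warp a 2D feature map with a per-pixel flow field.

    prev_feat: H x W nested list of floats (previous-frame features).
    flow:      H x W nested list of (dy, dx) offsets; each offset points from
               a current-frame pixel to its source in the previous frame
               (an assumed convention, not taken from the paper).
    """
    height, width = len(prev_feat), len(prev_feat[0])
    warped = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dy, dx = flow[y][x]
            # Nearest-neighbor sampling of the source location.
            src_y, src_x = round(y + dy), round(x + dx)
            # Pixels whose source falls outside the frame stay at zero.
            if 0 <= src_y < height and 0 <= src_x < width:
                warped[y][x] = prev_feat[src_y][src_x]
    return warped


def fuse(curr_feat, warped_prev, weight=0.5):
    """Blend current features with warped previous-frame features.

    A plain weighted average stands in for whatever fusion the real
    model uses; `weight` is the share given to the previous frame.
    """
    return [
        [(1 - weight) * c + weight * p for c, p in zip(curr_row, prev_row)]
        for curr_row, prev_row in zip(curr_feat, warped_prev)
    ]


# Toy example: a single active pixel moves one step to the right.
prev_feat = [[1.0, 0.0], [0.0, 0.0]]
flow = [[(0, -1), (0, -1)], [(0, -1), (0, -1)]]
curr_feat = [[0.0, 0.0], [0.0, 0.0]]

warped = warp_features(prev_feat, flow)
fused = fuse(curr_feat, warped)
```

In this toy case the activation at the top-left pixel of the previous frame lands at the top-right pixel after warping, so the fused map carries information from the previous frame to the location the flow predicts for the current frame.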

Cite

Zhao, Z., Wu, Z., Zhuang, Y., Li, B., & Jia, J. (2022). Tracking Objects as Pixel-Wise Distributions. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13682 LNCS, pp. 76–94). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-20047-2_5
