PETR: Position Embedding Transformation for Multi-view 3D Object Detection


Abstract

In this paper, we develop position embedding transformation (PETR) for multi-view 3D object detection. PETR encodes the position information of 3D coordinates into image features, producing 3D position-aware features. Object queries can perceive the 3D position-aware features and perform end-to-end object detection. PETR achieves state-of-the-art performance (50.4% NDS and 44.1% mAP) on the standard nuScenes dataset and ranks 1st on the benchmark. It can serve as a simple yet strong baseline for future research. Code is available at https://github.com/megvii-research/PETR.
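The core idea described above — back-projecting each image-feature pixel through a set of depth bins into 3D camera-frustum points, then mapping those coordinates to an embedding that is added to the 2D image features — can be sketched as follows. This is a hypothetical, untrained illustration, not the authors' implementation: the depth range, the random linear layer standing in for PETR's small MLP, and the toy intrinsics are all assumptions made for the example.

```python
import numpy as np

def frustum_to_3d_pe(H, W, D, K_inv, feat_dim, rng):
    """Hedged sketch of a PETR-style 3D position embedding (not the official code)."""
    # 1) Build a camera-frustum grid: each of the H*W feature pixels
    #    gets D discrete depth hypotheses.
    us, vs = np.meshgrid(np.arange(W), np.arange(H))            # pixel coords, (H, W)
    depths = np.linspace(1.0, 60.0, D)                          # assumed depth range in meters
    pix = np.stack([us, vs, np.ones_like(us)], axis=-1)         # homogeneous pixels, (H, W, 3)
    rays = pix @ K_inv.T                                        # back-project with inverse intrinsics
    pts3d = rays[:, :, None, :] * depths[None, None, :, None]   # 3D frustum points, (H, W, D, 3)

    # 2) Flatten the per-pixel 3D coordinates and map them to feat_dim.
    #    A fixed random linear layer stands in for PETR's learned MLP here.
    coords = np.tanh(pts3d.reshape(H, W, D * 3) / 60.0)         # normalize coordinates
    W_mlp = rng.standard_normal((D * 3, feat_dim)) * 0.01
    return coords @ W_mlp                                       # embedding, (H, W, feat_dim)

rng = np.random.default_rng(0)
# Toy camera intrinsics for an 8x16 feature map (illustrative values).
K_inv = np.linalg.inv(np.array([[500.0, 0.0, 8.0],
                                [0.0, 500.0, 4.0],
                                [0.0,   0.0, 1.0]]))
pe = frustum_to_3d_pe(H=8, W=16, D=4, K_inv=K_inv, feat_dim=32, rng=rng)
feats = rng.standard_normal((8, 16, 32))        # stand-in 2D image features
pos_aware = feats + pe                          # 3D position-aware features
print(pos_aware.shape)                          # (8, 16, 32)
```

The position-aware features would then be flattened and attended to by object queries in a DETR-style transformer decoder; that decoder is omitted here for brevity.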

Citation (APA)

Liu, Y., Wang, T., Zhang, X., & Sun, J. (2022). PETR: Position Embedding Transformation for Multi-view 3D Object Detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13687 LNCS, pp. 531–548). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-19812-0_31
