PETR: Position Embedding Transformation for Multi-view 3D Object Detection


Abstract

In this paper, we develop position embedding transformation (PETR) for multi-view 3D object detection. PETR encodes the position information of 3D coordinates into image features, producing 3D position-aware features. Object queries can perceive the 3D position-aware features and perform end-to-end object detection. PETR achieves state-of-the-art performance (50.4% NDS and 44.1% mAP) on the standard nuScenes dataset and ranks 1st on the benchmark. It can serve as a simple yet strong baseline for future research. Code is available at https://github.com/megvii-research/PETR.
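The core idea described above — back-projecting each image-feature pixel through a set of depth bins into 3D camera-frustum points, then mapping those coordinates to an embedding that is added to the 2D image features — can be sketched as follows. This is a hypothetical, untrained illustration, not the authors' implementation: the depth range, the random linear layer standing in for PETR's small MLP, and the toy intrinsics are all assumptions made for the example.

```python
import numpy as np

def frustum_to_3d_pe(H, W, D, K_inv, feat_dim, rng):
    """Hedged sketch of a PETR-style 3D position embedding (not the official code)."""
    # 1) Build a camera-frustum grid: each of the H*W feature pixels
    #    gets D discrete depth hypotheses.
    us, vs = np.meshgrid(np.arange(W), np.arange(H))            # pixel coords, (H, W)
    depths = np.linspace(1.0, 60.0, D)                          # assumed depth range in meters
    pix = np.stack([us, vs, np.ones_like(us)], axis=-1)         # homogeneous pixels, (H, W, 3)
    rays = pix @ K_inv.T                                        # back-project with inverse intrinsics
    pts3d = rays[:, :, None, :] * depths[None, None, :, None]   # 3D frustum points, (H, W, D, 3)

    # 2) Flatten the per-pixel 3D coordinates and map them to feat_dim.
    #    A fixed random linear layer stands in for PETR's learned MLP here.
    coords = np.tanh(pts3d.reshape(H, W, D * 3) / 60.0)         # normalize coordinates
    W_mlp = rng.standard_normal((D * 3, feat_dim)) * 0.01
    return coords @ W_mlp                                       # embedding, (H, W, feat_dim)

rng = np.random.default_rng(0)
# Toy camera intrinsics for an 8x16 feature map (illustrative values).
K_inv = np.linalg.inv(np.array([[500.0, 0.0, 8.0],
                                [0.0, 500.0, 4.0],
                                [0.0,   0.0, 1.0]]))
pe = frustum_to_3d_pe(H=8, W=16, D=4, K_inv=K_inv, feat_dim=32, rng=rng)
feats = rng.standard_normal((8, 16, 32))        # stand-in 2D image features
pos_aware = feats + pe                          # 3D position-aware features
print(pos_aware.shape)                          # (8, 16, 32)
```

The position-aware features would then be flattened and attended to by object queries in a DETR-style transformer decoder; that decoder is omitted here for brevity.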

Citation (APA)

Liu, Y., Wang, T., Zhang, X., & Sun, J. (2022). PETR: Position Embedding Transformation for Multi-view 3D Object Detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13687 LNCS, pp. 531–548). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-19812-0_31
