Monocular 3D object detection via Mask-Revised Network and quality perception loss

Fengsui Wang; Yue Xu; Jingang Chen; Lei Xiong

Journal Article, Open Access

IET Computer Vision (2023) 17(2) 231-240

DOI: 10.1049/cvi2.12157

Abstract

Pseudo-LiDAR methods have greatly improved the accuracy of monocular 3D detection. However, the depth map produced by depth estimation contains a lot of noise, which limits detection accuracy. To address this problem, an efficient monocular 3D object detection method that combines a Mask-Revised Network with a Quality Perception Loss is proposed. The method adaptively encodes the image into a mask and uses a visual attention mechanism to adjust the feature weights of the region of interest. The 2D bounding box confidence and the 3D bounding box quality are then used to compute the confidence of each 3D bounding box, which makes the predictions quality-aware. The proposed algorithm is evaluated on the KITTI test dataset. The results show that it achieves state-of-the-art performance on monocular 3D object detection and outperforms existing methods by about 2.75% on AP40 for the Car category.
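
The abstract does not say how the mask reweights features or how the 2D confidence and 3D quality are combined. The following is a minimal sketch of the two ideas under stated assumptions: the mask is taken to be a per-point soft weight in [0, 1] that scales each feature vector, and the final 3D box confidence is taken to be the product of the 2D confidence and a predicted 3D quality score. The function names `apply_mask_attention` and `quality_aware_confidence` are hypothetical and are not taken from the paper.

```python
def apply_mask_attention(features, mask):
    """Scale each feature vector by its soft mask weight.

    Assumption: `mask[i]` is a weight in [0, 1] for `features[i]`;
    the paper's actual attention mechanism may differ.
    """
    if len(features) != len(mask):
        raise ValueError("features and mask must have the same length")
    return [[m * f for f in row] for row, m in zip(features, mask)]


def quality_aware_confidence(conf_2d, quality_3d):
    """Combine a 2D box confidence with a 3D box quality score.

    Assumption: both inputs lie in [0, 1] and are combined by a
    simple product; the paper's exact formula is not given here.
    """
    for value in (conf_2d, quality_3d):
        if not 0.0 <= value <= 1.0:
            raise ValueError("scores must lie in [0, 1]")
    return conf_2d * quality_3d


if __name__ == "__main__":
    # Two points with 2-dimensional features; the second is down-weighted.
    print(apply_mask_attention([[1.0, 2.0], [3.0, 4.0]], [1.0, 0.5]))
    # A confident 2D detection with a mediocre 3D box gets a lower score.
    print(quality_aware_confidence(0.5, 0.5))
```

Under the product assumption, a box scores highly only when both the 2D detection and the 3D localisation are reliable, which is one plausible reading of "quality perception".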

Cite

Wang, F., Xu, Y., Chen, J., & Xiong, L. (2023). Monocular 3D object detection via Mask-Revised Network and quality perception loss. IET Computer Vision, 17(2), 231–240. https://doi.org/10.1049/cvi2.12157
