Monocular 3D object detection via Mask-Revised Network and quality perception loss

Abstract

Pseudo-LiDAR methods have greatly improved the accuracy of monocular 3D detection. However, the depth map produced by depth estimation contains considerable noise, which limits detection accuracy. To address this problem, an efficient monocular 3D object detection method that combines a Mask-Revised Network with a Quality Perception Loss is proposed. The method adaptively encodes the image into a mask and uses a visual attention mechanism to adjust the feature weights of the region of interest. The 2D bounding box confidence and the 3D bounding box quality are then used to compute a confidence for each 3D bounding box, making the prediction results quality-aware. Evaluated on the KITTI test set, the proposed method achieves state-of-the-art performance in monocular 3D object detection and outperforms existing methods by about 2.75% on AP40 for the Car category.
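
Two ideas in the abstract can be illustrated with a minimal Python sketch. The first function back-projects a depth map into a Pseudo-LiDAR point cloud using the standard pinhole camera model, which shows why depth noise propagates directly into the 3D points. The second combines a 2D box confidence with a 3D box quality score. The abstract does not give the exact formula, so the simple product used here is an assumption, and the function names, parameters, and intrinsics are hypothetical rather than taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_pseudo_lidar(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project a depth map (metres) into camera-frame 3D points.

    Pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):        # v: pixel row index
        for u, z in enumerate(row):        # u: pixel column index
            if z <= 0.0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


def quality_aware_confidence(conf_2d: float, quality_3d: float) -> float:
    """Combine 2D confidence with a 3D box quality score.

    ASSUMPTION: a plain product of two scores in [0, 1]; the paper's
    actual combination rule is not stated in the abstract.
    """
    if not (0.0 <= conf_2d <= 1.0 and 0.0 <= quality_3d <= 1.0):
        raise ValueError("both scores must lie in [0, 1]")
    return conf_2d * quality_3d


if __name__ == "__main__":
    # Toy 2x2 depth map with one invalid pixel; intrinsics are made up.
    toy_depth = [[10.0, 0.0], [20.0, 5.0]]
    cloud = depth_to_pseudo_lidar(toy_depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
    print(cloud)
    print(quality_aware_confidence(0.9, 0.5))
```

Because x and y both scale linearly with z, any error in the estimated depth shifts the whole back-projected point, which is the noise problem the abstract refers to. Under the product assumption, a box with a confident 2D detection but a poor 3D fit receives a low final score.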

Cite

Wang, F., Xu, Y., Chen, J., & Xiong, L. (2023). Monocular 3D object detection via Mask-Revised Network and quality perception loss. IET Computer Vision, 17(2), 231–240. https://doi.org/10.1049/cvi2.12157
