Multi-fusion with attention mechanism for 3D object detection


Abstract

Artificial intelligence increasingly plays an essential role in autonomous driving, for example in 3D object detection. Many state-of-the-art 3D detection frameworks fuse point cloud data and image data to perceive the environment surrounding the vehicle. However, these approaches focus primarily on vehicle detection; for objects with sparser point cloud sampling, such as pedestrians and cyclists, their performance is only moderate. In this paper, we propose a multi-fusion framework with two kinds of attention mechanisms to address this problem and improve 3D object detection accuracy. The framework employs a 3D attention mechanism based on voxel sparsity information and contains two key modules: point fusion with 2D attention and voxel fusion with 3D attention. These modules first obtain image features by projecting each lidar point, or the 8 vertices of each voxel, onto the image feature maps. They then perform attentive fusion on the voxelized image features, the point-wise image features, and the lidar data. Our evaluation on the challenging KITTI dataset, under both 3D and bird's eye view metrics, demonstrates substantial improvements, especially for objects with sparse point cloud sampling.
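The point-fusion step described above (project lidar points onto image feature maps, look up the corresponding image features, then fuse them attentively with the lidar features) can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, the KITTI-style 3x4 projection matrix `P`, the nearest-neighbour feature lookup, and the scalar sigmoid gate standing in for the learned attention weights are all assumptions.

```python
import numpy as np

def project_points(points_xyz, P):
    """Project N 3D points (camera coordinates) to pixel coords
    via a KITTI-style 3x4 projection matrix P (assumed layout)."""
    n = points_xyz.shape[0]
    pts_h = np.hstack([points_xyz, np.ones((n, 1))])  # homogeneous coords (N, 4)
    uvw = pts_h @ P.T                                 # (N, 3)
    return uvw[:, :2] / uvw[:, 2:3]                   # perspective divide -> (N, 2)

def gather_image_features(feat_map, uv):
    """Nearest-neighbour lookup of image features at the projected
    pixel locations (a real model would use bilinear sampling)."""
    h, w, _ = feat_map.shape
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
    return feat_map[v, u]                             # (N, C)

def attentive_fusion(lidar_feat, img_feat):
    """Toy point-wise attention: a scalar sigmoid gate per point,
    computed from the concatenated features, weights the image branch."""
    concat = np.concatenate([lidar_feat, img_feat], axis=1)
    gate = 1.0 / (1.0 + np.exp(-concat.mean(axis=1, keepdims=True)))
    return lidar_feat + gate * img_feat
```

For the voxel-fusion module, the same projection would be applied to the 8 corner vertices of each voxel and the gathered features pooled per voxel before the attentive fusion step.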

Citation (APA)

Wang, N., & Sun, P. (2021). Multi-fusion with attention mechanism for 3D object detection. In Proceedings of the International Conference on Software Engineering and Knowledge Engineering, SEKE (Vol. 2021-July, pp. 475–480). Knowledge Systems Institute Graduate School. https://doi.org/10.18293/SEKE2021-115
