LiDAR-Camera-Based Deep Dense Fusion for Robust 3D Object Detection

Abstract

In camera-LiDAR-based three-dimensional (3D) object detection, image features provide rich texture descriptions while LiDAR features capture objects' 3D geometry. To fully fuse the view-specific feature maps, this paper explores the bi-directional fusion of arbitrary-size camera feature maps and LiDAR feature maps in the early feature extraction stage. To this end, a deep dense fusion 3D object detection framework (DDF3D) is proposed for autonomous driving. It is a two-stage, end-to-end learnable architecture that takes 2D images and raw LiDAR point clouds as inputs and fully fuses the view-specific features to achieve high-precision oriented 3D detection. To fuse the arbitrary-size features from different views, a multi-view resize layer (MVRL) is introduced. Extensive experiments on the KITTI benchmark suite show that the proposed approach outperforms most state-of-the-art multi-sensor-based methods on all three classes at moderate difficulty (3D/BEV): Car (75.60%/88.65%), Pedestrian (64.36%/66.98%), and Cyclist (57.53%/57.30%). In particular, DDF3D greatly improves 2D detection accuracy at hard difficulty, reaching 88.19% for the car class.
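The abstract describes the MVRL as a layer that brings arbitrary-size view-specific feature maps to a common size so they can be densely fused. Below is a minimal illustrative sketch, not the paper's implementation: the module name `MultiViewResizeFuse`, the choice of bilinear interpolation, and the 1x1-convolution fusion are all assumptions made only to show the resize-then-fuse idea.

```python
# Hypothetical sketch of a resize-and-fuse step for two view-specific
# feature maps (camera image view and LiDAR BEV view). All names and
# design choices here are assumptions, not the paper's MVRL.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiViewResizeFuse(nn.Module):
    """Resize arbitrary-size camera and LiDAR feature maps to a shared
    spatial size, then fuse them channel-wise with a 1x1 convolution."""

    def __init__(self, cam_channels, lidar_channels, out_channels, target_size):
        super().__init__()
        self.target_size = target_size  # (H, W) common grid used for fusion
        self.fuse = nn.Conv2d(cam_channels + lidar_channels, out_channels, kernel_size=1)

    def forward(self, cam_feat, lidar_feat):
        # Bring both view-specific maps onto the same spatial grid.
        cam_resized = F.interpolate(cam_feat, size=self.target_size,
                                    mode='bilinear', align_corners=False)
        lidar_resized = F.interpolate(lidar_feat, size=self.target_size,
                                      mode='bilinear', align_corners=False)
        # Dense fusion: concatenate along channels and mix with a 1x1 conv.
        return self.fuse(torch.cat([cam_resized, lidar_resized], dim=1))

# Example with made-up sizes: a 48x160 camera feature map and a 100x88
# LiDAR BEV feature map fused on a 100x88 grid.
fusion = MultiViewResizeFuse(cam_channels=64, lidar_channels=64,
                             out_channels=128, target_size=(100, 88))
fused = fusion(torch.randn(1, 64, 48, 160), torch.randn(1, 64, 100, 88))
print(fused.shape)  # torch.Size([1, 128, 100, 88])
```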

Cite

APA

Wen, L., & Jo, K. H. (2020). LiDAR-Camera-Based Deep Dense Fusion for Robust 3D Object Detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12465 LNAI, pp. 133–144). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-60796-8_12
