Existing outdoor three-dimensional (3D) object detection algorithms mainly rely on a single type of sensor, such as a monocular camera or a radar point cloud sensor. However, camera images are sensitive to lighting conditions and lack depth information, while the point clouds collected by a short-range radar sensor become very sparse for distant or occluded objects, which degrades detection. To address these challenges, we design a deep learning network that combines the texture information of two-dimensional (2D) data with the geometric information of 3D data for object detection. To overcome the limitations of a single sensor, we use a reverse mapping layer and an aggregation layer to combine the texture information of RGB data with the geometric information of point cloud data, and we design a max pooling layer to handle input from multi-view cameras. In addition, to address the shortcomings of 3D object detection algorithms based on the region proposal network (RPN), we use a Hough voting algorithm implemented by a deep neural network to propose objects. Experimental results show that, compared with PointRCNN, the average precision (AP) of our algorithm is 1.06% lower for easy car detection, but our algorithm requires 37.7% less computation time under the same hardware environment. Moreover, our algorithm improves the AP by 1.14% over PointRCNN for hard car detection.
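As a rough illustration of the fusion idea described above, the following Python sketch projects each 3D point into several camera views, looks up a per-pixel feature for every view in which the point is visible, max-pools those features across views, and concatenates the result with the point coordinates. It is not the authors' implementation. The function names (`project_points`, `gather_view_features`, `fuse_multi_view`), the pinhole projection with a 3 x 4 camera matrix, the nearest-pixel lookup, and the zero fill for points seen by no camera are all assumptions made for illustration.

```python
import numpy as np


def project_points(points, P):
    """Project N x 3 points with a 3 x 4 camera matrix (assumed pinhole model).

    Returns N x 2 pixel coordinates and a mask of points in front of the camera.
    """
    homo = np.hstack([points, np.ones((points.shape[0], 1))])  # N x 4
    uvw = homo @ P.T                                           # N x 3
    depth = uvw[:, 2]
    in_front = depth > 1e-6
    # Avoid dividing by zero or negative depth; such points are masked out later.
    safe_depth = np.where(in_front, depth, 1.0)
    uv = uvw[:, :2] / safe_depth[:, None]
    return uv, in_front


def gather_view_features(points, feature_map, P):
    """Look up one H x W x C image feature map at each point's projected pixel."""
    H, W, C = feature_map.shape
    uv, in_front = project_points(points, P)
    cols = np.round(uv[:, 0]).astype(int)
    rows = np.round(uv[:, 1]).astype(int)
    visible = in_front & (cols >= 0) & (cols < W) & (rows >= 0) & (rows < H)
    # Invisible points get -inf so they never win the max over views.
    feats = np.full((points.shape[0], C), -np.inf)
    feats[visible] = feature_map[rows[visible], cols[visible]]
    return feats, visible


def fuse_multi_view(points, feature_maps, projections):
    """Max-pool image features over all views, then append them to the XYZ coordinates."""
    per_view = [
        gather_view_features(points, fmap, P)[0]
        for fmap, P in zip(feature_maps, projections)
    ]
    pooled = np.max(np.stack(per_view, axis=0), axis=0)  # N x C
    pooled[~np.isfinite(pooled)] = 0.0                   # points seen by no camera
    return np.hstack([points, pooled])                   # N x (3 + C)
```

Max pooling is used here because it is invariant to the number and order of camera views, so the fused feature has the same size whether a point is seen by one camera or by several.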
CITATION STYLE
Yan, M., Li, Z., Yu, X., & Jin, C. (2020). An End-to-End Deep Learning Network for 3D Object Detection from RGB-D Data Based on Hough Voting. IEEE Access, 8, 138810–138822. https://doi.org/10.1109/ACCESS.2020.3012695