BEVStereo: Enhancing Depth Estimation in Multi-View 3D Object Detection with Temporal Stereo

Yinhao Li; Han Bao; Zheng Ge; Jinrong Yang; Jianjian Sun; Zeming Li

Conference Proceedings, Open Access

Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023 (2023), vol. 37, pp. 1486-1494

DOI: 10.1609/aaai.v37i2.25234

161 Citations

104 Readers

Abstract

Restricted by their limited depth perception, all multi-view 3D object detection methods are bottlenecked by depth accuracy. By constructing temporal stereo, depth estimation becomes quite reliable in indoor scenarios. However, there are two difficulties in directly integrating temporal stereo into outdoor multi-view 3D object detectors: 1) constructing temporal stereo for all views incurs a high computing cost; 2) the approach is unable to adapt to challenging outdoor scenarios. In this study, we propose an effective method for creating temporal stereo by dynamically determining the center and range of the temporal stereo. The most confident center is found using the expectation-maximization (EM) algorithm. Extensive experiments on nuScenes show BEVStereo's ability to deal with complex outdoor scenarios that other stereo-based methods are unable to handle. For the first time, a stereo-based approach shows superiority in scenarios such as a static ego vehicle and moving objects. BEVStereo achieves a new state of the art in the camera-only track of the nuScenes dataset while maintaining memory efficiency. Code has been released.

Given that monocular depth estimation has reached its limit and that time-series input images are available in autonomous driving scenarios, it makes sense to apply temporal stereo approaches to multi-view 3D object detection. However, incorporating the temporal stereo method into a multi-view 3D detector runs into the two difficulties listed above.

Cite

Li, Y., Bao, H., Ge, Z., Yang, J., Sun, J., & Li, Z. (2023). BEVStereo: Enhancing Depth Estimation in Multi-View 3D Object Detection with Temporal Stereo. In Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023 (Vol. 37, pp. 1486–1494). AAAI Press. https://doi.org/10.1609/aaai.v37i2.25234
