Unsupervised learning of depth from monocular videos using 3D-2D corresponding constraints

4Citations
Citations of this article
13Readers
Mendeley users who have this article in their library.

Abstract

Depth estimation can provide tremendous help for object detection, localization, path planning, etc. However, the existing methods based on deep learning have high requirements on computing power and often cannot be directly applied to autonomous moving platforms (AMP). Fifth-generation (5G) mobile and wireless communication systems have attracted the attention of researchers because it provides the network foundation for cloud computing and edge computing, which makes it possible to utilize deep learning method on AMP. This paper proposes a depth prediction method for AMP based on unsupervised learning, which can learn from video sequences and simultaneously estimate the depth structure of the scene and the ego-motion. Compared with the existing unsupervised learning methods, our method makes the spatial correspondence among pixel points consistent with the image area by smoothing the 3D corresponding vector field based on 2D image, which effectively improves the depth prediction ability of the neural network. Our experiments on the KITTI driving dataset demonstrated that our method outperformed other previous learning-based methods. The results on the Apolloscape and Cityscapes datasets show that our proposed method has a strong universality.

Cite

CITATION STYLE

APA

Jin, F., Zhao, Y., Wan, C., Yuan, Y., & Wang, S. (2021). Unsupervised learning of depth from monocular videos using 3D-2D corresponding constraints. Remote Sensing, 13(9). https://doi.org/10.3390/rs13091764

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free