Self-supervised monocular depth and visual odometry learning with scale-consistent geometric constraints


Abstract

Self-supervised learning-based depth and visual odometry (VO) estimators, trained on monocular videos without ground truth, have drawn significant attention recently. Prior works use photometric consistency as supervision, which is fragile in complex, realistic environments because of illumination variations. More importantly, photometric supervision suffers from scale inconsistency in the depth and pose estimation results. In this paper, robust geometric losses are proposed to deal with this problem. Specifically, we first align the scales of two reconstructed depth maps estimated from adjacent image frames, and then enforce forward-backward relative pose consistency to formulate scale-consistent geometric constraints. Finally, a novel training framework is constructed to implement the proposed losses. Extensive evaluations on the KITTI and Make3D datasets demonstrate that (i) by incorporating the proposed constraints as supervision, the depth estimation model achieves state-of-the-art (SOTA) performance among self-supervised methods, and (ii) the proposed training framework is effective for obtaining a VO model with a uniform global scale.
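
The two constraints named in the abstract can be illustrated with a small sketch. The abstract does not give the loss formulas, so everything below is an assumption made for illustration: the mean-ratio alignment rule, the normalized absolute difference, the Frobenius-norm pose residual, and all function names are hypothetical and are not the authors' implementation. The sketch also assumes the two depth maps have already been reconstructed in a common view, so they can be compared pixel by pixel as flat lists.

```python
import math

def align_scale(depth_a, depth_b, eps=1e-8):
    """Rescale depth_b so its mean matches the mean of depth_a (assumed rule)."""
    mean_a = sum(depth_a) / len(depth_a)
    mean_b = sum(depth_b) / len(depth_b)
    scale = mean_a / (mean_b + eps)
    return [d * scale for d in depth_b], scale

def depth_consistency_loss(depth_a, depth_b, eps=1e-8):
    """Mean normalized absolute difference between depth_a and scale-aligned depth_b."""
    aligned_b, _ = align_scale(depth_a, depth_b, eps)
    terms = [abs(a - b) / (a + b + eps) for a, b in zip(depth_a, aligned_b)]
    return sum(terms) / len(terms)

def matmul4(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def pose_consistency_loss(T_fwd, T_bwd):
    """Frobenius norm of T_fwd @ T_bwd - I; zero when T_bwd inverts T_fwd."""
    P = matmul4(T_fwd, T_bwd)
    return math.sqrt(sum((P[i][j] - (1.0 if i == j else 0.0)) ** 2
                         for i in range(4) for j in range(4)))

def planar_pose(yaw, tx, ty):
    """Homogeneous 4x4 pose: rotation by yaw about z, translation (tx, ty, 0)."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, tx],
            [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def invert_pose(T):
    """Closed-form inverse of a rigid transform: [R^T, -R^T t]."""
    R_t = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    t_inv = [-sum(R_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [R_t[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]
```

In this sketch, two depth maps that differ only by a global factor give a depth loss close to zero after alignment, and a forward pose paired with its exact inverse gives a pose loss close to zero. Any residual in either term signals a geometric inconsistency that photometric supervision alone would not penalize.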

Cite

Xiong, M., Zhang, Z., Zhong, W., Ji, J., Liu, J., & Xiong, H. (2020). Self-supervised monocular depth and visual odometry learning with scale-consistent geometric constraints. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2021-January, pp. 963–969). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2020/134
