Scale-Aware Visual-Inertial Depth Estimation and Odometry Using Monocular Self-Supervised Learning

Chungkeun Lee; Changhyeon Kim; Pyojin Kim; Hyeonbeom Lee; H. Jin Kim

Journal Article, Open Access

IEEE Access (2023) 11 24087-24102

DOI: 10.1109/ACCESS.2023.3252884

Abstract

For real-world applications with a single monocular camera, scale ambiguity is an important issue. Self-supervised data-driven approaches that use no additional data containing scale information cannot avoid this ambiguity, so state-of-the-art deep-learning-based methods address it by learning the scale from additional sensor measurements. In that regard, the inertial measurement unit (IMU) is a popular sensor for various mobile platforms because it is lightweight and inexpensive. However, unlike supervised learning, which can learn the scale from ground-truth information, learning the scale from an IMU is challenging in a self-supervised setting. We propose a scale-aware monocular visual-inertial depth estimation and odometry method with end-to-end training. To learn the scale from IMU measurements with end-to-end training in the monocular self-supervised setup, we propose a new loss function, the preintegration loss, which trains scale-aware ego-motion by comparing the ego-motion integrated from IMU measurements with the predicted ego-motion. Because gravity and bias must be compensated to obtain the ego-motion by integrating IMU measurements, we design a network that predicts the gravity and the bias in addition to the ego-motion and the depth map. The proposed method is compared with state-of-the-art methods on a popular outdoor driving dataset, KITTI, and on an author-collected indoor driving dataset. On KITTI, the proposed method shows competitive performance compared with state-of-the-art monocular depth estimation and odometry methods: a root-mean-square error of 5.435 m on the KITTI Eigen split, and an absolute trajectory error of 22.46 m and 0.2975 degrees on KITTI odometry sequence 09. Unlike other up-to-scale monocular methods, the proposed method can estimate metric-scaled depth and camera poses. Additional experiments on the author-collected indoor driving dataset qualitatively confirm the accuracy of the metric depth and metric pose estimates.
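
The idea of comparing IMU-integrated ego-motion with predicted ego-motion can be illustrated with a minimal sketch. This is not the authors' preintegration loss; the abstract does not give its formulation. The sketch assumes a textbook constant-acceleration integration over one frame interval, per-sample rotation matrices that are already known, a zero initial velocity by default, and an L2 distance on translation only. The names `integrate_imu` and `translation_discrepancy` are hypothetical, and in the paper the gravity and bias passed in here would come from the network rather than being given.

```python
import math


def integrate_imu(accels, rotations, dt, gravity, accel_bias, v0=(0.0, 0.0, 0.0)):
    """Integrate accelerometer samples into a displacement and a velocity.

    Assumptions (illustrative, not taken from the paper):
    - accels: body-frame specific-force samples, each a 3-tuple in m/s^2
    - rotations: one 3x3 body-to-reference rotation matrix per sample
    - gravity: gravity vector in the reference frame, e.g. (0, 0, -9.81)
    - accel_bias: constant accelerometer bias in the body frame
    """
    p = [0.0, 0.0, 0.0]
    v = list(v0)
    for a_meas, R in zip(accels, rotations):
        # Remove the bias in the body frame.
        a_body = [a_meas[i] - accel_bias[i] for i in range(3)]
        # Rotate into the reference frame and compensate gravity.
        a_ref = [
            sum(R[i][j] * a_body[j] for j in range(3)) + gravity[i]
            for i in range(3)
        ]
        # Constant-acceleration update over one sample period.
        for i in range(3):
            p[i] += v[i] * dt + 0.5 * a_ref[i] * dt * dt
            v[i] += a_ref[i] * dt
    return p, v


def translation_discrepancy(pred_translation, imu_translation):
    """L2 distance between a predicted and an IMU-integrated translation."""
    return math.sqrt(
        sum((a - b) ** 2 for a, b in zip(pred_translation, imu_translation))
    )


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # Ten samples at 10 Hz: 1 m/s^2 forward, plus gravity reaction and a bias.
    samples = [(1.1, 0.0, 9.81)] * 10
    p, v = integrate_imu(
        samples, [identity] * 10, 0.1, (0.0, 0.0, -9.81), (0.1, 0.0, 0.0)
    )
    print(p, v, translation_discrepancy((0.5, 0.0, 0.0), p))
```

Because the integrated displacement is in metres, a discrepancy of this kind penalizes a predicted translation whose scale is wrong, which is the intuition behind learning metric scale from the IMU. The sketch also shows why gravity and bias matter: if either is left uncompensated, the error in acceleration is integrated twice and grows quadratically with the interval length.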

Cite

Lee, C., Kim, C., Kim, P., Lee, H., & Kim, H. J. (2023). Scale-Aware Visual-Inertial Depth Estimation and Odometry Using Monocular Self-Supervised Learning. IEEE Access, 11, 24087–24102. https://doi.org/10.1109/ACCESS.2023.3252884
