Depth estimation from a monocular image is of paramount importance in various vision tasks, such as obstacle detection, robot navigation, and 3D reconstruction. However, how to get an accurate depth map with clear details and a fine resolution remains an unresolved issue. As an attempt to solve this problem, we exploit image super-resolution concepts and techniques for monocular depth estimation and propose a novel CNN-based approach, namely MSCN NS , which involves multi-scale sub-pixel convolutions and a neighborhood smoothness constraint. Specifically, MSCN NS makes use of sub-pixel convolutions with multi-scale fusions to retrieve a high-resolution depth map with fine details of the scene. Different from previous multi-scale fusion strategies, those multi-scale features come from supervised scale branches of the network. Furthermore, MSCN NS incorporates a neighborhood smoothness regularization term to make sure that spatially closer pixels with similar features would have close depth values. The effectiveness and efficiency of MSCN NS have been corroborated through extensive experiments conducted on benchmark datasets.
CITATION STYLE
Zhao, S., Zhang, L., Shen, Y., Zhao, S., & Zhang, H. (2019). Super-Resolution for Monocular Depth Estimation with Multi-Scale Sub-Pixel Convolutions and a Smoothness Constraint. IEEE Access, 7, 16323–16335. https://doi.org/10.1109/ACCESS.2019.2894651
Mendeley helps you to discover research relevant for your work.