Multi-scale spatio-temporal feature extraction and depth estimation from sequences by ordinal classification

Abstract

Depth estimation is a key problem in 3D computer vision and has a wide variety of applications. In this paper we explore whether a deep learning network can predict depth maps accurately by learning multi-scale spatio-temporal features from sequences and by recasting depth estimation from a regression task as an ordinal classification task. We design an encoder-decoder network with several multi-scale strategies to improve its performance and extract spatio-temporal features with ConvLSTM. Our experiments show that the proposed method improves error metrics by almost 10% and accuracy metrics by up to 2%. The results also indicate that extracting spatio-temporal features can dramatically improve performance on the depth estimation task. We are considering extending this work to a self-supervised setting to remove the dependence on large-scale labeled data.
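
To make the regression-to-ordinal-classification idea concrete, the following is a minimal sketch of how a continuous depth value might be turned into ordered binary targets and decoded back. The abstract does not say how depth is discretized, so the log-spaced thresholds, the 0.5 decision cut-off, and all function names here are assumptions for illustration, not the paper's implementation.

```python
import math


def make_thresholds(d_min, d_max, num_bins):
    """Return num_bins - 1 interior thresholds, evenly spaced in log-depth.

    Log spacing is an assumption; it gives finer bins at short range.
    """
    log_min, log_max = math.log(d_min), math.log(d_max)
    step = (log_max - log_min) / num_bins
    return [math.exp(log_min + step * k) for k in range(1, num_bins)]


def encode_ordinal(depth, thresholds):
    """Ordinal targets: entry k is 1 if depth exceeds thresholds[k], else 0."""
    return [1 if depth > t else 0 for t in thresholds]


def decode_ordinal(probs, thresholds, d_min, d_max):
    """Count thresholds predicted as exceeded, then return that bin's
    geometric centre as the depth estimate."""
    label = sum(1 for p in probs if p > 0.5)
    edges = [d_min] + list(thresholds) + [d_max]
    return math.sqrt(edges[label] * edges[label + 1])
```

With `d_min=1.0`, `d_max=16.0`, and four bins, the thresholds fall near 2, 4, and 8, so a depth of 5.0 is encoded as `[1, 1, 0]`. Unlike a plain one-hot class label, this encoding keeps the ordering of the bins: a prediction that is off by one bin disagrees with the target in only one position.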

Cite

Liu, Y. (2020). Multi-scale spatio-temporal feature extraction and depth estimation from sequences by ordinal classification. Sensors (Switzerland), 20(7). https://doi.org/10.3390/s20071979
