T2 Net: Synthetic-to-realistic translation for solving single-image depth estimation tasks

Abstract

Current methods for single-image depth estimation use training datasets with real image-depth pairs or stereo pairs, which are not easy to acquire. We propose a framework, trained on synthetic image-depth pairs and unpaired real images, that comprises an image translation network for enhancing realism of input images, followed by a depth prediction network. A key idea is having the first network act as a wide-spectrum input translator, taking in either synthetic or real images, and ideally producing minimally modified realistic images. This is done via a reconstruction loss when the training input is real, and GAN loss when synthetic, removing the need for heuristic self-regularization. The second network is trained on a task loss for synthetic image-depth pairs, with extra GAN loss to unify real and synthetic feature distributions. Importantly, the framework can be trained end-to-end, leading to good results, even surpassing early deep-learning methods that use real paired data.
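The routing of losses described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the L1 form of the reconstruction and task losses, the least-squares form of the adversarial term, and the weight `w_feat` are all assumptions made for the example. Images, depth maps, and discriminator scores are represented as flat lists of floats to keep the sketch self-contained.

```python
from typing import Sequence


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error between two equally sized flat sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def lsgan_generator_loss(disc_scores: Sequence[float]) -> float:
    """Assumed least-squares adversarial term: push scores towards 1 ('real')."""
    return sum((s - 1.0) ** 2 for s in disc_scores) / len(disc_scores)


def translator_loss(
    input_is_real: bool,
    translated: Sequence[float],
    original: Sequence[float],
    disc_scores: Sequence[float],
) -> float:
    """Loss for the image translation network.

    Real input: reconstruction loss, so real images pass through
    (ideally) unchanged. Synthetic input: adversarial loss, so the
    output looks realistic to an image discriminator.
    """
    if input_is_real:
        return l1_loss(translated, original)
    return lsgan_generator_loss(disc_scores)


def depth_network_loss(
    pred_depth: Sequence[float],
    true_depth: Sequence[float],
    feature_disc_scores: Sequence[float],
    w_feat: float = 0.1,  # hypothetical weight, not taken from the paper
) -> float:
    """Task loss on synthetic image-depth pairs plus a feature-level
    adversarial term that aligns real and synthetic feature distributions."""
    task = l1_loss(pred_depth, true_depth)
    feat = lsgan_generator_loss(feature_disc_scores)
    return task + w_feat * feat
```

In an end-to-end setup, the two terms would be summed and backpropagated through both networks; the discriminators themselves would be trained with their own objectives, which this sketch omits.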

Cite

Zheng, C., Cham, T. J., & Cai, J. (2018). T2 Net: Synthetic-to-realistic translation for solving single-image depth estimation tasks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11211 LNCS, pp. 798–814). Springer Verlag. https://doi.org/10.1007/978-3-030-01234-2_47
