Deep Multi-view Stereo for Dense 3D Reconstruction from Monocular Endoscopic Video

Abstract

3D reconstruction from monocular endoscopic images is a challenging task. State-of-the-art multi-view stereo (MVS) algorithms based on image patch similarity often fail to obtain a dense reconstruction from weakly-textured endoscopic images. In this paper, we present a novel deep-learning-based MVS algorithm that can produce a dense and accurate 3D reconstruction from a monocular endoscopic image sequence. Our method consists of three key steps. First, a number of depth candidates are sampled around the depth prediction made by a pre-trained CNN. Second, each candidate is projected to the other images in the sequence, and the matching score is measured using a patch embedding network that maps each image patch into a compact embedding. Finally, the candidate with the highest score is selected for each pixel. Experiments on colonoscopy videos demonstrate that our patch embedding network outperforms zero-normalized cross-correlation and a state-of-the-art stereo matching network in terms of matching accuracy, and that our MVS algorithm produces a reconstruction several orders of magnitude denser than the competing methods when the same accuracy filtering is applied.
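The three steps can be illustrated for a single pixel and a single source view with a minimal NumPy sketch. It is not the authors' implementation. The function names, the evenly spaced relative sampling range, the pinhole camera model, and the nearest-pixel patch lookup are all assumptions made for illustration. The learned patch embedding network is not reproduced here, so zero-normalized cross-correlation (ZNCC), the baseline named in the abstract, stands in as the matching score.

```python
import numpy as np

def sample_depth_candidates(d_pred, num=5, rel_range=0.2):
    """Step 1 (assumed scheme): evenly spaced depths within +/- rel_range of d_pred."""
    return d_pred * np.linspace(1.0 - rel_range, 1.0 + rel_range, num)

def project(K, R, t, pixel, depth):
    """Step 2a: back-project a reference pixel at `depth`, then project it
    into the source view. R, t map reference-camera to source-camera coordinates."""
    u, v = pixel
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # viewing ray with z = 1
    x_src = K @ (R @ (depth * ray) + t)
    return x_src[:2] / x_src[2]

def extract_patch(img, center, half=3):
    """Square patch around the nearest pixel, or None if it leaves the image."""
    u, v = int(round(float(center[0]))), int(round(float(center[1])))
    h, w = img.shape
    if u - half < 0 or v - half < 0 or u + half >= w or v + half >= h:
        return None
    return img[v - half:v + half + 1, u - half:u + half + 1]

def zncc(a, b, eps=1e-8):
    """Step 2b stand-in score: zero-normalized cross-correlation in [-1, 1]."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.sqrt((a * a).sum() * (b * b).sum()) + eps))

def select_depth(ref_img, src_img, K, R, t, pixel, d_pred,
                 num=5, rel_range=0.2, half=3):
    """Step 3: keep the candidate depth with the highest matching score."""
    ref_patch = extract_patch(ref_img, pixel, half)
    best_depth, best_score = None, -np.inf
    if ref_patch is None:
        return best_depth, best_score
    for d in sample_depth_candidates(d_pred, num, rel_range):
        src_patch = extract_patch(src_img, project(K, R, t, pixel, d), half)
        if src_patch is None:
            continue  # candidate projects outside the source image
        score = zncc(ref_patch, src_patch)
        if score > best_score:
            best_depth, best_score = d, score
    return best_depth, best_score
```

In a full pipeline the score would typically be aggregated over several source views and the selection repeated for every pixel. Swapping `zncc` for a similarity between learned patch embeddings would bring the sketch closer to the approach the abstract describes.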

Cite

Bae, G., Budvytis, I., Yeung, C. K., & Cipolla, R. (2020). Deep Multi-view Stereo for Dense 3D Reconstruction from Monocular Endoscopic Video. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12263 LNCS, pp. 774–783). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-59716-0_74
