Cascaded Transposed Long-Range Convolutions for Monocular Depth Estimation

Abstract

We study the shape of the convolution kernels in the upsampling block for deep monocular depth estimation. First, our empirical analysis shows that depth estimation accuracy can be improved consistently by changing only the shape of two consecutive convolution layers with square kernels, e.g., (5 × 5) → (5 × 5), to two “long-range” kernels, one having the transposed shape of the other, e.g., (1 × 25) → (25 × 1). Second, based on this observation, we propose a new upsampling block called Cascaded Transposed Long-range Convolutions (CTLC) that uses parallel sequences of two long-range convolutions with different kernel shapes. Experiments with NYU Depth V2 and KITTI show that our CTLC offers higher accuracy with fewer parameters and FLOPs than state-of-the-art methods.
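The kernel shapes quoted in the abstract can be compared with some simple arithmetic. The sketch below is illustrative only and is not the authors' implementation: it assumes stride-1, undilated convolutions without bias, and a hypothetical channel width of 64 (the abstract does not state channel counts). Under those assumptions, it counts the weights of each two-layer stack and the receptive field each stack covers.

```python
def conv_params(kh, kw, c_in, c_out):
    """Weight count of one 2D convolution (bias omitted by assumption)."""
    return kh * kw * c_in * c_out


def total_params(kernels, channels):
    """Weight count of a stack of convolutions that keep `channels` fixed."""
    return sum(conv_params(kh, kw, channels, channels) for kh, kw in kernels)


def stacked_receptive_field(kernels):
    """Receptive field (height, width) of stride-1, undilated convolutions
    applied one after another."""
    rf_h = rf_w = 1
    for kh, kw in kernels:
        rf_h += kh - 1
        rf_w += kw - 1
    return rf_h, rf_w


C = 64  # hypothetical channel width, chosen only for illustration

square = [(5, 5), (5, 5)]        # (5 x 5) -> (5 x 5), as in the abstract
long_range = [(1, 25), (25, 1)]  # (1 x 25) -> (25 x 1), as in the abstract

square_params = total_params(square, C)
long_params = total_params(long_range, C)
square_rf = stacked_receptive_field(square)
long_rf = stacked_receptive_field(long_range)

print("square:     params =", square_params, "receptive field =", square_rf)
print("long-range: params =", long_params, "receptive field =", long_rf)
```

With these assumptions both stacks hold the same number of weights, since each layer has 25 kernel positions, while the long-range pair spans a 25 × 25 window instead of 9 × 9. This only describes the shapes; it makes no claim about why accuracy changes.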

Cite

Irie, G., Ikami, D., Kawanishi, T., & Kashino, K. (2021). Cascaded Transposed Long-Range Convolutions for Monocular Depth Estimation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12624 LNCS, pp. 437–453). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-69535-4_27
