Gaussian dilated convolution for semantic image segmentation

Abstract

In semantic image segmentation, multi-scale contextual information is collected by probing the features with large dilated convolution filters or spatial pooling operations. Such enlargement of the receptive field promotes a more stable and globally consistent segmentation prediction. Dilated convolution can be treated as the combination of a sampling process and an ordinary convolution. For example, a 3 × 3 convolution with a large dilation rate picks only 9 positions in a very large window. In this paper, we propose a more rational way to sample features from a very large receptive field. Specifically, Gaussian kernels are used to accumulate features around each sampled position to produce a more stable representation. We also examine the difference between up-sampling logits and down-sampling the ground truth, and provide a theoretical explanation. We demonstrate the effectiveness of Gaussian dilated convolution on the semantic image segmentation datasets Pascal VOC 2012, Cityscapes and ADE20k. Gaussian dilated convolution consistently outperforms dilated convolution throughout our experiments. Code will be released for reproduction.
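The sampling idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: it assumes that "accumulating features with Gaussian kernels" can be approximated by Gaussian-smoothing a single-channel feature map before applying an ordinary dilated 3 × 3 convolution. The function names, the kernel size rule, and the default `sigma = dilation / 2` are illustrative choices and do not come from the paper.

```python
import numpy as np


def gaussian_kernel(size, sigma):
    """Return a normalized 2-D Gaussian kernel of shape (size, size)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()


def correlate2d_same(x, k, dilation=1):
    """Zero-padded, same-size cross-correlation of a 2-D map with an
    odd-sized kernel whose taps are spaced `dilation` pixels apart."""
    kh, kw = k.shape
    ph, pw = dilation * (kh // 2), dilation * (kw // 2)
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    h, w = x.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(kh):
        for j in range(kw):
            # Shifted view of the padded input seen by tap (i, j).
            out += k[i, j] * xp[i * dilation:i * dilation + h,
                                j * dilation:j * dilation + w]
    return out


def dilated_conv(x, w, dilation):
    """Plain dilated convolution: each tap reads exactly one pixel."""
    return correlate2d_same(x, w, dilation)


def gaussian_dilated_conv(x, w, dilation, sigma=None):
    """Sketch of Gaussian dilated convolution: each tap reads a
    Gaussian-weighted average around its sampling position.
    The default sigma is an assumption, not a value from the paper."""
    if sigma is None:
        sigma = dilation / 2.0
    size = 2 * int(np.ceil(2 * sigma)) + 1
    smoothed = correlate2d_same(x, gaussian_kernel(size, sigma))
    return correlate2d_same(smoothed, w, dilation)


if __name__ == "__main__":
    x = np.zeros((33, 33))
    x[16, 16] = 1.0          # single active feature
    w = np.ones((3, 3))      # toy 3 x 3 filter weights
    plain = dilated_conv(x, w, dilation=4)
    gauss = gaussian_dilated_conv(x, w, dilation=4)
    print(np.count_nonzero(plain), np.count_nonzero(gauss))
```

With a single active pixel as input, the plain dilated filter responds at only 9 output positions, while the Gaussian variant responds over a dense neighborhood. This is one way to see why Gaussian accumulation could give a more stable representation than sparse point sampling.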

Cite

CITATION STYLE

APA

Shen, F., & Zeng, G. (2018). Gaussian dilated convolution for semantic image segmentation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11164 LNCS, pp. 324–334). Springer Verlag. https://doi.org/10.1007/978-3-030-00776-8_30