Image segmentation using encoder-decoder with deformable convolutions

18Citations
Citations of this article
16Readers
Mendeley users who have this article in their library.

Abstract

Image segmentation is an essential step in image analysis that brings meaning to the pixels in the image. Nevertheless, it is also a difficult task due to the lack of a general suited approach to this problem and the use of real-life pictures that can suffer from noise or object obstruction. This paper proposes an architecture for semantic segmentation using a convolutional neural network based on the Xception model, which was previously used for classification. Different experiments were made in order to find the best performances of the model (eg. different resolution and depth of the network and data augmentation techniques were applied). Additionally, the network was improved by adding a deformable convolution module. The proposed architecture obtained a 76.8 mean IoU on the Pascal VOC 2012 dataset and 58.1 on the Cityscapes dataset. It outperforms SegNet and U-Net networks, both networks having considerably more parameters and also a higher inference time.

Cite

CITATION STYLE

APA

Gurita, A., & Mocanu, I. G. (2021). Image segmentation using encoder-decoder with deformable convolutions. Sensors, 21(5), 1–27. https://doi.org/10.3390/s21051570

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free