PPEDNet: Pyramid pooling encoder-decoder network for real-time semantic segmentation

This article is free to access.

Abstract

Semantic image segmentation is a fundamental problem in computer vision and artificial intelligence. Recent deep neural networks have significantly improved segmentation accuracy, but the number of network parameters and floating-point operations has grown notably as well. Real-world applications demand not only high segmentation accuracy but also real-time processing. In this paper, we propose a pyramid pooling encoder-decoder network, named PPEDNet, that targets both better accuracy and faster processing. The encoder is based on VGG16 with the fully connected layers discarded, because they account for a very large number of parameters. To extract context features efficiently, we design a pyramid pooling architecture. The decoder is a trainable convolutional network that upsamples the encoder output and refines the segmentation details. We evaluate our method on the CamVid dataset, where it improves mIoU by 7.214% while using 17.9% fewer parameters than the state-of-the-art algorithm.
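The abstract does not spell out the pyramid pooling design, so the following is only a generic illustration of the idea, not the authors' architecture. It is a minimal pure-Python sketch that assumes average pooling, nearest-neighbour upsampling, and bin sizes of 1, 2 and 4; all function names and parameters here are hypothetical. A single-channel feature map is pooled at several grid resolutions, each pooled map is upsampled back to the input size, and the results are stacked with the original map as extra channels.

```python
def avg_pool_to_bins(fmap, bins):
    """Average-pool a 2D map (list of rows) into a bins x bins grid.

    Assumes the map has at least `bins` rows and `bins` columns.
    """
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(bins):
        r0, r1 = i * h // bins, (i + 1) * h // bins
        row = []
        for j in range(bins):
            c0, c1 = j * w // bins, (j + 1) * w // bins
            cells = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def upsample_nearest(grid, h, w):
    """Nearest-neighbour upsampling of a square grid to h x w."""
    bins = len(grid)
    return [[grid[r * bins // h][c * bins // w] for c in range(w)]
            for r in range(h)]


def pyramid_pool(fmap, bin_sizes=(1, 2, 4)):
    """Return the input map plus one context map per pyramid level.

    The bin sizes are an illustrative assumption, not taken from the paper.
    """
    h, w = len(fmap), len(fmap[0])
    channels = [fmap]
    for b in bin_sizes:
        channels.append(upsample_nearest(avg_pool_to_bins(fmap, b), h, w))
    return channels


# Toy 4 x 4 feature map holding the values 0..15.
fmap = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
out = pyramid_pool(fmap)
```

The coarsest level (one bin) carries the global average of the map to every position, while finer levels keep progressively more local context. In a real network the same operation would run per channel on the encoder output, typically followed by a convolution that fuses the stacked maps.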

Cite

Tan, Z., Liu, B., & Yu, N. (2017). PPEDNet: Pyramid pooling encoder-decoder network for real-time semantic segmentation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10666 LNCS, pp. 328–339). Springer Verlag. https://doi.org/10.1007/978-3-319-71607-7_29
