Polyp2Seg: Improved Polyp Segmentation with Vision Transformer

Vittorino Mandujano-Cornejo; Javier A. Montoya-Zegarra

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 13413 LNCS, pp. 519-534 (2022)

DOI: 10.1007/978-3-031-12053-4_39

Abstract

Colorectal cancer (CRC) is the third most common type of cancer worldwide. It can be prevented by screening the colon and detecting polyps that might become malignant. Accurate detection and segmentation of polyps in colonoscopy images is therefore crucial for CRC prevention. In this paper, we propose a novel transformer-based architecture for polyp image segmentation named Polyp2Seg. The model adopts a transformer architecture as its encoder to extract multi-hierarchical features. Additionally, a novel Feature Aggregation Module (FAM) progressively merges the multi-level features from the encoder to better localise polyps by adding semantic information. Next, a Multi-Context Attention Module (MCAM) removes noise and other artifacts, while incorporating a multi-scale attention mechanism to further improve polyp detection. Quantitative and qualitative experiments on five challenging datasets, compared against five different state-of-the-art methods, demonstrate that our method significantly improves polyp segmentation accuracy under different evaluation metrics. Our model achieves a new state of the art on most of the datasets.
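The idea of progressively merging multi-level encoder features can be illustrated with a generic top-down fusion. The sketch below is not the authors' FAM: the function names, the nearest-neighbour upsampling, the element-wise addition, and the single-channel toy feature maps are all assumptions chosen only to show the data flow from coarse to fine levels.

```python
def upsample2x(grid):
    """Nearest-neighbour 2x upsampling of a 2D list (assumed operator)."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(2)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                     # duplicate each row
    return out


def add(a, b):
    """Element-wise sum of two 2D lists of equal size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def aggregate_top_down(features):
    """Merge feature maps from the coarsest level down to the finest.

    `features` is ordered fine -> coarse, and each level is assumed to
    halve the height and width of the previous one.
    """
    merged = features[-1]
    for finer in reversed(features[:-1]):
        merged = add(finer, upsample2x(merged))
    return merged


# Hypothetical three-level pyramid: 4x4, 2x2, and 1x1 maps filled with ones.
pyramid = [
    [[1.0] * 4 for _ in range(4)],
    [[1.0] * 2 for _ in range(2)],
    [[1.0]],
]
fused = aggregate_top_down(pyramid)
print(len(fused), len(fused[0]))
```

In this toy setting the fused map has the resolution of the finest level, and every position carries a contribution from each coarser level, which is the intuition behind adding semantic information to fine-grained features.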

Cite

Mandujano-Cornejo, V., & Montoya-Zegarra, J. A. (2022). Polyp2Seg: Improved Polyp Segmentation with Vision Transformer. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13413 LNCS, pp. 519–534). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-12053-4_39
