A Review of Colorectal Polyp Segmentation Methods
DOI:
https://doi.org/10.54097/pqj8hf46
Keywords:
Colorectal polyp, Deep learning, Segmentation method, Learning paradigm, Dataset
Abstract
Colorectal cancer is a malignant tumor with a persistently high incidence worldwide, and the majority of cases arise from the malignant transformation of polyps. However, polyps in colonoscopic images exhibit significant variations in morphology and scale, and the boundaries between polyps and surrounding tissues are often indistinct, which makes manual detection particularly challenging. On the one hand, the diagnostic process relies heavily on physicians’ experience, which may lead to missed diagnoses and misdiagnoses. On the other hand, the limited number of experienced clinicians makes it difficult to meet patients’ needs in a timely manner. This diagnostic uncertainty, combined with the scarcity of medical resources, highlights the critical value of AI-assisted diagnostic systems. Developing high-performance polyp segmentation algorithms that accurately locate and delineate polyp regions can improve diagnostic efficiency and provide effective decision support for clinicians. Recent advances in deep learning have significantly improved colorectal polyp segmentation. Regarding network architectures, models have evolved from early U-shaped designs to Transformer-based architectures that capture global context. More recently, Mamba-like networks and dual-branch architectures have also been proposed. In terms of research paradigms, this paper focuses on the application of methods such as multi-scale feature fusion, attention mechanisms, and diffusion models. In addition, this study summarizes commonly used datasets in the field of polyp segmentation and provides a detailed description of evaluation metrics closely related to segmentation performance. Finally, this paper systematically analyzes the limitations of current colorectal polyp segmentation methods in clinical practice. Based on this analysis, it discusses potential directions and trends for future algorithm development, providing a reference for further research.
Copyright (c) 2026 Journal of Computing and Electronic Information Management

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.








