Intestinal polyp image segmentation method based on improved CSWin Transformer

Abstract

To address the limited segmentation accuracy of traditional UNet-type intestinal polyp segmentation models, an intestinal polyp segmentation model based on an improved CSWin Transformer is proposed. The model consists of an encoder and a decoder. First, in the encoding stage, a CSWin Transformer with cross-shaped window self-attention serves as the encoder and extracts global context information from intestinal polyp images. A convolutional block attention module (CBAM) is introduced into the CSWin Transformer block of each encoder layer to strengthen the model's ability to capture polyp region and edge information. Second, in the decoding stage, the CSWin Transformer is also used as the decoder, and the encoder and decoder are linked by skip connections. Finally, a self-aware attention (SAA) module is applied in the middle layers between the encoder and decoder to establish non-local information interaction between features. Experimental results on the open-source Kvasir-SEG, CVC-ClinicDB, EndoTect, and CVC-ColonDB datasets show that the proposed method achieves Dice coefficients of 0.888, 0.927, 0.904, and 0.911, respectively, and MIoU values of 0.876, 0.902, 0.831, and 0.860, respectively. Compared with the traditional U-shaped intestinal polyp segmentation model, the Dice coefficient and MIoU increase by 2.1% and 2.5%, respectively.
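The two reported metrics can be illustrated with a minimal sketch. The abstract does not state how Dice and MIoU were computed, so the following assumes binary masks (1 = polyp, 0 = background) and takes MIoU as the mean of per-image IoU scores. The function names and the smoothing term `eps` are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # binary mask given as rows of 0/1 values


def _counts(pred: Mask, target: Mask) -> Tuple[int, int, int]:
    """Return (intersection, predicted foreground, true foreground) pixel counts."""
    inter = pred_sum = target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += p & t
            pred_sum += p
            target_sum += t
    return inter, pred_sum, target_sum


def dice_coefficient(pred: Mask, target: Mask, eps: float = 1e-7) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|); eps (an assumption) avoids division by zero."""
    inter, pred_sum, target_sum = _counts(pred, target)
    return (2 * inter + eps) / (pred_sum + target_sum + eps)


def iou(pred: Mask, target: Mask, eps: float = 1e-7) -> float:
    """IoU = |A ∩ B| / |A ∪ B|, with the same eps smoothing."""
    inter, pred_sum, target_sum = _counts(pred, target)
    union = pred_sum + target_sum - inter
    return (inter + eps) / (union + eps)


def mean_iou(pairs: Iterable[Tuple[Mask, Mask]]) -> float:
    """Average IoU over (prediction, ground truth) pairs (assumed MIoU definition)."""
    scores = [iou(pred, target) for pred, target in pairs]
    return sum(scores) / len(scores)


# Toy example: 2 overlapping pixels, 3 predicted, 3 in the ground truth.
example_pred = [[1, 1, 0], [0, 1, 0]]
example_target = [[1, 0, 0], [0, 1, 1]]
example_dice = dice_coefficient(example_pred, example_target)  # about 2/3
example_iou = iou(example_pred, example_target)                # about 1/2
```

For a single pair of masks the two metrics are related by Dice = 2·IoU / (1 + IoU), so the Dice score is never lower than the IoU score.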

Key words: intestinal polyp segmentation, CSWin Transformer, convolutional block attention, non-local information interaction, deep learning

CLC Number: TP391

ZHAO Hong, MI Shan, AN Ding. Intestinal polyp image segmentation method based on improved CSWin Transformer[J]. Journal of Lanzhou University of Technology, 2026, 52(2): 91-98.


URL: https://journal.lut.edu.cn/EN/

https://journal.lut.edu.cn/EN/Y2026/V52/I2/91
