Deep learning for the analysis of medical images in endoscopy: approaches to early diagnosis

 
Abstract

Specular highlights, mucosal folds, and scale changes in endoscopic frames make polyp boundaries visually unstable, so a segmentation model must capture small lesions while avoiding mask leakage beyond the lesion area. The objective was to improve binary polyp segmentation by introducing an additional geometric cue that encodes pixel proximity to the object boundary, while keeping post-processing simple and reproducible. The hypothesis was that adding a boundary distance map (denoted φ, "phi") and incorporating it into the loss design would increase sensitivity to small polyps and raise recall without destabilizing performance on medium and large lesions. The study compared two variants of the same backbone: a U-shaped convolutional network (U-Net) with a residual network (ResNet-34) as the encoder, trained under comparable optimization settings with early stopping. Materials and methods involved training a baseline model and a corrected φ-fixed model in which φ was computed consistently with the ground-truth masks; test evaluation used the Sørensen–Dice coefficient and the Jaccard index to quantify overlap between predicted and reference masks, along with recall, precision, and pixel-wise false-positive/false-negative fractions. The binarization threshold was selected on the validation split, post-processing retained the largest connected component, and confidence intervals for metric differences were estimated via sequence-level bootstrap. Results demonstrated a consistent improvement for φ-fixed (validation-optimal threshold 0.2) over the baseline (threshold 0.8) on the test set: Dice increased from 0.6642 to 0.7002, Jaccard from 0.5905 to 0.6295, and recall from 0.6154 to 0.7723.
The bootstrap estimate of the difference (φ-fixed minus baseline) was +0.0361 for Dice with a 95% interval of [+0.0113, +0.0640] and +0.2150 for recall with [+0.0899, +0.3962], while precision decreased by 0.1537 with [−0.2541, −0.0663], reflecting a shift toward fewer misses at the cost of additional false detections. Conclusions indicate that the φ cue provides a practical gain in a recall-oriented operating regime: φ-fixed improves mask overlap and substantially raises recall, with a controlled increase in false positives. Validation-driven threshold selection remains essential because φ-fixed changes the error trade-off and benefits from a lower binarization threshold to realize the recall advantage.
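
The abstract does not include source code, so the following is only a minimal sketch of how a boundary distance map of the kind described (pixel proximity to the lesion boundary) can be computed from a binary ground-truth mask, assuming SciPy's Euclidean distance transform; the function name `boundary_distance_map` is illustrative and not taken from the article.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def boundary_distance_map(mask: np.ndarray) -> np.ndarray:
    """Euclidean distance of every pixel to the nearest lesion-boundary pixel.

    `mask` is a binary ground-truth polyp mask (True = lesion).
    """
    mask = mask.astype(bool)
    if mask.all() or (~mask).all():
        # Degenerate frame with no boundary: define phi as all zeros.
        return np.zeros(mask.shape, dtype=np.float32)
    # distance_transform_edt measures distance to the nearest zero element:
    # the first term is positive only outside the lesion (distance to the
    # lesion), the second only inside it (distance to the background), so
    # their sum approximates distance to the boundary everywhere.
    dist_outside = distance_transform_edt(~mask)
    dist_inside = distance_transform_edt(mask)
    return (dist_outside + dist_inside).astype(np.float32)
```

A map of this kind can then be folded into the loss design, for example by weighting pixel-wise errors according to their distance from the boundary; the exact way the study incorporates φ into the loss is not specified in this abstract.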

General Information

Keywords: polyp segmentation, endoscopy frames, convolutional neural networks, U-Net, ResNet-34, boundary distance map, Sørensen–Dice coefficient, Jaccard index, thresholding, connected-component post-processing
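
Several of the keywords above (Sørensen–Dice coefficient, Jaccard index, thresholding, connected-component post-processing) describe the evaluation pipeline used in the study. A minimal, dependency-light sketch of that pipeline follows; the function names are illustrative, not from the article.

```python
import numpy as np
from collections import deque

def largest_component(binary: np.ndarray) -> np.ndarray:
    """Keep only the largest 4-connected component of a 2D binary mask."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    best = np.zeros((h, w), dtype=bool)
    best_size = 0
    for i in range(h):
        for j in range(w):
            if binary[i, j] and not seen[i, j]:
                # Breadth-first flood fill of one component.
                comp = []
                queue = deque([(i, j)])
                seen[i, j] = True
                while queue:
                    y, x = queue.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w \
                                and binary[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
                if len(comp) > best_size:
                    best_size = len(comp)
                    best = np.zeros((h, w), dtype=bool)
                    for y, x in comp:
                        best[y, x] = True
    return best

def postprocess(prob_map: np.ndarray, threshold: float) -> np.ndarray:
    """Binarize at the validation-selected threshold, keep the largest component."""
    return largest_component(prob_map >= threshold)

def dice_jaccard(pred: np.ndarray, gt: np.ndarray):
    """Sørensen–Dice and Jaccard overlap between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    p, g = pred.sum(), gt.sum()
    dice = 2 * inter / (p + g) if (p + g) else 1.0
    jaccard = inter / (p + g - inter) if (p + g - inter) else 1.0
    return dice, jaccard
```

The threshold passed to `postprocess` would be the one selected on the validation split (0.2 for the φ-fixed model, 0.8 for the baseline in the reported experiments).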

Journal rubric: Data Analysis

Article type: scientific article

DOI: https://doi.org/10.17759/mda.2026160102

Received 26.12.2025

Revised 15.01.2026

Accepted

Published

For citation: Almusawi, M.R.K., Lyapuntsova, E.V. (2026). Deep learning for the analysis of medical images in endoscopy: approaches to early diagnosis. Modelling and Data Analysis, 16(1), 27–49. (In Russ.). https://doi.org/10.17759/mda.2026160102

© Almusawi M.R.K., Lyapuntsova E.V., 2026

License: CC BY-NC 4.0


Information About the Authors

Mustafa R. Almusawi, graduate student, National University of Science and Technology "MISIS", Moscow, Russian Federation, e-mail: adammadam265@gmail.com

Elena V. Lyapuntsova, Doctor of Engineering, Professor of the Department of Computer-Aided Design and Engineering, National University of Science and Technology "MISIS", Professor, Bauman Moscow State Technical University, Moscow, Russian Federation, ORCID: https://orcid.org/0000-0002-3420-3805, e-mail: lev86@bmstu.ru

Contribution of the authors

All authors participated in the discussion of the results and approved the final text of the manuscript.

Conflict of interest

The authors declare no conflict of interest.
