Journal of Lanzhou University of Technology ›› 2025, Vol. 51 ›› Issue (5): 92-99.

• Automation Technology and Computer Technology •

Deformable partition attention based melanoma recognition method for multimodal skin disease corpus

LIN Yu-ping1, LIU Meng-jiao2, WANG Ming-hao2, ZHANG Dong2, XU Mei-feng*3, LI Ce4

  1. School of Foreign Studies, Xi’an Jiaotong University, Xi’an 710049, China;
  2. College of Artificial Intelligence, Xi’an Jiaotong University, Xi’an 710049, China;
  3. Department of Dermatology, Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an 710004, China;
  4. School of Automation and Electrical Engineering, Lanzhou University of Technology, Lanzhou 730050, China
  • Received: 2025-03-14 Published: 2025-10-25
  • Corresponding author: XU Mei-feng (born 1977), female, from Zhangzhou, Fujian; associate chief physician. Email: xumf96@163.com
  • Funding:
    National Natural Science Foundation of China (62363025); Social Science Foundation of Shaanxi Province (2021K014)

Abstract: To address the problem of diagnosing melanoma from images, this paper proposes a melanoma recognition method based on a deformable partition attention mechanism. The method adopts a coarse-to-fine feature extraction and recognition strategy to distinguish melanoma from common moles and to establish the corresponding semantic labels; on this basis, case texts are combined with the images to construct a multimodal skin disease corpus. First, because large differences among the benign and malignant subcategories make the model difficult to train and lower its recognition efficiency, a hierarchical learning architecture is constructed that progresses from coarse categories to fine categories. Second, to handle blurred lesion boundaries, uneven lesion distribution, and difficult feature extraction, a deformable partition attention module is proposed that fuses an attention mechanism with deformable convolution and, through a coarse-to-fine feature extraction strategy, effectively combines global and local features. In addition, a joint loss function is introduced to improve the model's recognition accuracy. Experimental results show that the proposed algorithm achieves high sensitivity and high specificity on a self-constructed dataset, and that it effectively improves the accuracy of matching case texts with medical images when constructing the multimodal skin disease corpus.

Key words: medical image processing, melanoma recognition, deformable convolution, attention mechanism, deep learning, multimodal corpus

CLC number:
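
The abstract reports results in terms of sensitivity and specificity. As a reminder of what these two figures measure, the following is a minimal sketch of the standard definitions for a binary melanoma versus common mole decision: sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP). The function name `sensitivity_specificity` and the toy labels are assumptions made for illustration; they are not taken from the paper and do not reproduce its data or results.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    Hypothetical helper for illustration; `positive` marks the melanoma class.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # melanoma correctly flagged
            else:
                fn += 1  # melanoma missed
        else:
            if pred == positive:
                fp += 1  # mole wrongly flagged as melanoma
            else:
                tn += 1  # mole correctly cleared
    # Guard against an empty class to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels invented for this example (1 = melanoma, 0 = common mole).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In a screening setting the two values are usually read together: high sensitivity means few melanomas are missed, while high specificity means few benign moles are flagged for unnecessary follow-up.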