Journal of Lanzhou University of Technology ›› 2026, Vol. 52 ›› Issue (1): 93-100.

• Automation Technology and Computer Technology •

Remote sensing image object detection method based on improved YOLOv7

CHEN Hui*1, TIAN Bo1, ZHAO Yong-hong2, QU Hai-ping2, LIANG Jian-hu2

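  1. School of Automation and Electrical Engineering, Lanzhou University of Technology, Lanzhou 730050, Gansu, China;
  2. Gansu Changfeng Electronic Technology Co., Ltd., Lanzhou 730070, Gansu, China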
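  • Received: 2023-07-30 Online: 2026-02-28 Published: 2026-03-05
  • Corresponding author: CHEN Hui (born 1978), male, from Wenxi, Shanxi; Ph.D., professor, doctoral supervisor. Email: huich78@hotmail.com
  • Funding:
    National Natural Science Foundation of China (62163023, 62366031, 62363023), Gansu Province Basic Research Innovation Group (25JRRA058), Central Government Guiding Funds for Local Science and Technology Development (25ZYJA040), Gansu Province Key Talent Project (2024RCXM86), Gansu Province Special Fund for Military-Civilian Integration Development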

Remote sensing image object detection method based on enhanced YOLOv7

CHEN Hui*1, TIAN Bo1, ZHAO Yong-hong2, QU Hai-ping2, LIANG Jian-hu2

  1. School of Automation and Electrical Engineering, Lanzhou University of Technology, Lanzhou 730050, China;
  2. Gansu Province Changfeng Electronic Technology Co., Ltd., Lanzhou 730070, China
  • Received: 2023-07-30 Online: 2026-02-28 Published: 2026-03-05

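Abstract: To address the large number of small objects, the dense object distribution, and the frequent missed and false detections in remote sensing images, a remote sensing image object detection method based on an improved YOLOv7 model is proposed. First, the DCNv2 structure and a residual structure are introduced into the YOLOv7 model to rebuild the backbone network, strengthening the extraction of shallow feature information and improving network accuracy. Second, a new feature fusion module is adopted in the neck network, and the SimAM attention mechanism adaptively adjusts the fusion weights of shallow texture information and deep semantic information, suppressing in a more targeted way the noise introduced when shallow features are extracted. Finally, the normalized Gaussian Wasserstein distance loss is adopted as the regression loss function of the model, replacing the traditional IoU, to improve detection of multi-scale objects. The algorithm reaches a small-object average precision of 20.1% on the DOTAv1.0 dataset and 29.0% on the DIOR dataset. Compared with methods such as YOLOv7 and YOLOv6, the algorithm is also strongly competitive.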

Keywords: remote sensing image, object detection, deformable convolutional network, SimAM attention mechanism, Gaussian Wasserstein distance

Abstract: To address the large number of small objects, the dense object distribution, and the resulting missed and false detections in remote sensing images, this paper proposes a remote sensing image object detection method based on an enhanced YOLOv7 model. First, the DCNv2 structure and a residual structure are integrated into the YOLOv7 model to rebuild the backbone network, which strengthens the extraction of shallow feature information and improves network accuracy. Second, a new feature fusion module is incorporated into the neck network and combined with the SimAM attention mechanism, which adaptively adjusts the fusion weights of shallow texture information and deep semantic information, suppressing the noise introduced during shallow feature extraction and strengthening the representation of essential features. Finally, the normalized Gaussian Wasserstein distance loss is used as the regression loss function, replacing the traditional IoU, to improve detection of multi-scale objects. On the DOTAv1.0 dataset the method reaches an average precision of 20.1% for small objects, and on the DIOR dataset it reaches 29.0% for small objects. Compared with methods such as YOLOv7 and YOLOv6, the proposed algorithm is also strongly competitive.
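
The regression loss named in the abstract can be illustrated with a minimal sketch of the normalized Gaussian Wasserstein distance in its commonly published form. Each box (cx, cy, w, h) is treated as a 2D Gaussian with mean (cx, cy) and covariance diag(w^2/4, h^2/4). The function names and the default normalizing constant `c` are assumptions for illustration only; the abstract does not state the constant or the exact loss weighting used in the paper.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Gaussian Wasserstein distance between two (cx, cy, w, h) boxes.

    c is a normalizing constant; the default is a placeholder, not a value
    taken from the paper.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Closed-form squared 2-Wasserstein distance between the two Gaussians
    # N((cx, cy), diag(w^2/4, h^2/4)) that model the boxes.
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2.0) ** 2
        + ((ha - hb) / 2.0) ** 2
    )
    # Map the distance into (0, 1]; identical boxes give exactly 1.
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(pred_box, target_box, c=12.8):
    """Regression loss: 0 for a perfect match, approaching 1 for distant boxes."""
    return 1.0 - nwd(pred_box, target_box, c)


# Two 4x4 boxes whose centers are 6 pixels apart do not overlap, so their IoU
# is 0, yet the NWD-based loss still varies smoothly with the offset.
small_pred = (10.0, 10.0, 4.0, 4.0)
small_target = (16.0, 10.0, 4.0, 4.0)
print(nwd_loss(small_pred, small_target))
```

Because the measure depends on center offset and size difference rather than on overlap area, it stays informative for small objects whose predicted and ground-truth boxes do not intersect, which is the usual motivation for replacing IoU in this setting.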

Key words: remote sensing image, object detection, deformable convolutional networks, SimAM attention mechanism, Gaussian Wasserstein distance
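
For readers unfamiliar with the SimAM attention mechanism listed in the key words, the following is a minimal sketch of its parameter-free weighting for a single feature-map channel, written in plain Python for clarity. It follows the commonly published SimAM formulation; the function names and the regularizer value `lam` are illustrative assumptions, and how the paper wires this into its feature fusion module is not described in the abstract.

```python
import math


def simam_weights(channel, lam=1e-4):
    """Return per-position attention weights for one channel (a list of rows).

    lam is a small regularizer; its value here is an illustrative assumption.
    """
    values = [v for row in channel for v in row]
    if len(values) < 2:
        raise ValueError("channel must contain at least two positions")
    mean = sum(values) / len(values)
    # Variance estimate with n - 1 in the denominator, as in common
    # SimAM implementations.
    variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    weights = []
    for row in channel:
        weight_row = []
        for v in row:
            # Inverse of the minimal energy: positions that deviate more
            # from the channel mean are treated as more important.
            inv_energy = (v - mean) ** 2 / (4.0 * (variance + lam)) + 0.5
            weight_row.append(1.0 / (1.0 + math.exp(-inv_energy)))
        weights.append(weight_row)
    return weights


def simam(channel, lam=1e-4):
    """Scale every activation by its SimAM weight."""
    weights = simam_weights(channel, lam)
    return [
        [v * w for v, w in zip(row, weight_row)]
        for row, weight_row in zip(channel, weights)
    ]


# A channel with one strong response in the center of a flat background.
feature = [[1.0, 1.0, 1.0], [1.0, 5.0, 1.0], [1.0, 1.0, 1.0]]
attention = simam_weights(feature)
print(attention[1][1] > attention[0][0])
```

The weighting introduces no learnable parameters, so it can be attached to a fusion stage without changing the model size.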

CLC number: