Remote sensing image object detection method based on enhanced YOLOv7

Abstract

To address the challenges of large-scale small-object detection and dense object distribution in remote sensing images, along with the resulting missed detections and false positives, this paper proposes a remote sensing image object detection method based on an enhanced YOLOv7 model. First, the DCNv2 (deformable convolution) structure and a residual architecture are integrated into YOLOv7 to reconstruct the backbone network, which strengthens the extraction of shallow-level feature information and improves network accuracy. Second, a new feature fusion module is added to the neck network and combined with the SimAM attention mechanism. The module adaptively adjusts the fusion weights of shallow-level texture information and deep-level semantic information, which suppresses noise introduced during shallow feature extraction and strengthens the representation of essential features. Finally, the normalized Gaussian Wasserstein distance is used as the regression loss function in place of the traditional IoU-based loss to improve detection of multi-scale targets. Experiments on the DOTAv1.0 dataset show an average precision of 20.1% for small objects, while the DIOR dataset yields an average precision of 29.0%. Compared with recent advanced methods such as YOLOv7 and YOLOv6, the proposed algorithm performs competitively.
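To illustrate why a normalized Gaussian Wasserstein distance (NWD) can serve small objects better than IoU as a regression target, the sketch below follows the general formulation of Wang et al. for axis-aligned boxes given as (cx, cy, w, h). It is not the authors' implementation. The function names, the box format, and the normalizing constant `c = 12.8` are assumptions made for illustration, since the abstract does not state the settings used.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Gaussian Wasserstein distance between two (cx, cy, w, h) boxes.

    Each box is modeled as a 2D Gaussian with mean (cx, cy) and covariance
    diag(w^2 / 4, h^2 / 4). The constant c is an assumed, dataset-dependent
    normalizer; 12.8 is only a placeholder value.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two Gaussians.
    w2_sq = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    # Exponential normalization maps the distance into (0, 1].
    return math.exp(-math.sqrt(w2_sq) / c)


def nwd_loss(pred, target, c=12.8):
    """Regression loss: 0 for identical boxes, approaching 1 as they diverge."""
    return 1.0 - nwd(pred, target, c)


def iou(box_a, box_b):
    """Plain IoU for (cx, cy, w, h) boxes, included only for comparison."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


# Two 4x4-pixel boxes whose centers are 5 pixels apart do not overlap.
tiny_pred = (10.0, 10.0, 4.0, 4.0)
tiny_gt = (15.0, 10.0, 4.0, 4.0)
print(iou(tiny_pred, tiny_gt))       # IoU is zero once the boxes stop overlapping
print(nwd_loss(tiny_pred, tiny_gt))  # NWD loss still varies with the offset
```

For the two tiny boxes, IoU drops to zero as soon as they stop overlapping and carries no information about how far apart they are, whereas the NWD loss keeps changing smoothly with the center offset. This property is the usual motivation for using NWD on small targets.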

Key words: remote sensing image, object detection, deformable convolutional networks, SimAM attention mechanism, Gaussian Wasserstein distance

CLC Number: TP751

CHEN Hui, TIAN Bo, ZHAO Yong-hong, QU Hai-ping, LIANG Jian-hu. Remote sensing image object detection method based on enhanced YOLOv7[J]. Journal of Lanzhou University of Technology, 2026, 52(1): 93-100.
