YOLO-SS: optimizing YOLO for enhanced small object detection in remote sensing imagery

The Journal of Supercomputing

Abstract

Detecting very small objects in remote sensing data is a formidable challenge in computer vision, because such objects may occupy only a handful of pixels. Their lack of distinctive shape features limits the effectiveness of established object detection algorithms. Small object detection in remote sensing plays an important role in areas such as environmental monitoring and agricultural yield estimation. To address this challenge, we introduce YOLO-SS, an enhanced version of the YOLO algorithm tailored specifically to small object detection in remote sensing imagery. YOLO-SS incorporates an optimized backbone network, a restructured loss function and an asymmetric training sample weighting strategy. These improvements focus the model's attention on high-quality positive samples of small objects while reducing its sensitivity to complex backgrounds. Evaluation on the AI-TOD dataset shows that YOLO-SS achieves an AP50 score of 0.535, surpassing YOLOv6L by 13.4% and outperforming other popular object detection algorithms. Our findings offer a novel pathway for advancing small object detection capabilities in diverse remote sensing applications.
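To make the difficulty behind the AP50 figure concrete, the following minimal sketch computes intersection over union (IoU) for a tiny box and a large box that are each mislocalized by one pixel in both directions. It is an illustration only and is not part of YOLO-SS: the box sizes, the one-pixel shift and the helper names (`iou`, `shifted`, `matches_at_ap50`) are assumptions chosen for the example, and the matching rule is simplified to the IoU test alone (class agreement and duplicate suppression are ignored).

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap extent along each axis, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def shifted(box, dx, dy):
    """Return the box translated by (dx, dy) pixels."""
    x1, y1, x2, y2 = box
    return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)


def matches_at_ap50(gt_box, pred_box, threshold=0.5):
    """Simplified AP50 matching rule: the prediction counts if IoU >= 0.5."""
    return iou(gt_box, pred_box) >= threshold


# Hypothetical ground-truth boxes: a 4x4 pixel object and a 32x32 pixel object.
tiny_gt = (0, 0, 4, 4)
large_gt = (0, 0, 32, 32)

for name, gt in (("tiny 4x4", tiny_gt), ("large 32x32", large_gt)):
    pred = shifted(gt, 1, 1)  # the same one-pixel localization error for both
    print(f"{name}: IoU = {iou(gt, pred):.3f}, match at 0.5 = {matches_at_ap50(gt, pred)}")
```

Under these assumptions the same one-pixel error leaves the large box comfortably above the 0.5 threshold but pushes the 4x4 box below it (IoU of 9/23), which is one reason detectors tuned for ordinary object sizes tend to score poorly on tiny targets.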

Data availability

The AI-TOD dataset is publicly available at https://github.com/jwwangchn/AI-TOD. Its use raises no ethical concerns, and the authors have no conflict of interest regarding it. The datasets used and analyzed during the current study are available from the corresponding author or first author on reasonable request.

Author information

Contributions

Qiang Tang assisted with methodology, algorithm, writing and editing, Chang Shang helped with algorithm and writing, Kai Yang, Yuan Tian and Shibin Zhao carried out ablation study, and Wei Hao, Xubin Feng and Meilin Xie were responsible for financial support, laboratory equipment and experiment guidance.

Corresponding author

Correspondence to Meilin Xie.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tang, Q., Su, C., Tian, Y. et al. YOLO-SS: optimizing YOLO for enhanced small object detection in remote sensing imagery. J Supercomput 81, 303 (2025). https://doi.org/10.1007/s11227-024-06765-8

  • DOI: https://doi.org/10.1007/s11227-024-06765-8