Abstract
Traditional object detection algorithms often struggle to detect objects accurately in complex, cluttered environments. This paper presents an optimized YOLOv5-based model to mitigate these limitations. Our contributions are threefold. First, we enhance the upsampling procedure by combining transposed convolution with the CBAM attention mechanism, strengthening the network’s fine-grained feature extraction. Second, we introduce an optimized feature-processing module that improves feature utilization while keeping the architecture lightweight. Third, we integrate EfficientNet into the backbone to improve feature extraction performance. We validate our approach on the PASCAL VOC dataset, achieving an mAP0.5 of 84.00% and an mAP0.5:0.95 of 62.10% while maintaining a modest parameter size of 13.22 MB. These results mark improvements of 4.50% ± 0.12% and 8.20% ± 0.09%, respectively, over the benchmark, demonstrating an efficient trade-off between computational efficiency and detection accuracy. The proposed model outperforms conventional YOLOv5 algorithms and remains competitive with contemporary state-of-the-art object detection techniques. Code is available at https://github.com/chenxz0906chenxz/YOLO-TUF/.
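For readers unfamiliar with the two reported metrics, the sketch below illustrates the IoU thresholds behind them, assuming the usual convention that mAP0.5 counts a prediction as correct at an IoU of at least 0.5, while mAP0.5:0.95 averages over ten thresholds from 0.50 to 0.95 in steps of 0.05. It is not the paper's evaluation code; the function names `iou` and `matched_thresholds` and the example boxes are hypothetical.

```python
# Illustrative sketch only: names and boxes are assumptions, not from the paper.

def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed threshold sweep for mAP0.5:0.95: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred, truth):
    """Thresholds at which `pred` would count as a match for `truth`."""
    overlap = iou(pred, truth)
    return [t for t in IOU_THRESHOLDS if overlap >= t]


if __name__ == "__main__":
    truth = (0, 0, 10, 10)
    pred = (0, 0, 9, 8)
    print(iou(pred, truth))
    print(matched_thresholds(pred, truth))
```

A prediction that overlaps the ground truth only loosely therefore contributes to mAP0.5 but is rejected at the stricter thresholds, which is why mAP0.5:0.95 is typically the lower of the two figures.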
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Chen, H., Yang, W., Wang, W., Liu, Z. (2024). YOLO-TUF: An Improved YOLOv5 Model for Small Object Detection. In: Jin, H., Pan, Y., Lu, J. (eds) Artificial Intelligence and Machine Learning. IAIC 2023. Communications in Computer and Information Science, vol 2058. Springer, Singapore. https://doi.org/10.1007/978-981-97-1277-9_37
DOI: https://doi.org/10.1007/978-981-97-1277-9_37
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-1276-2
Online ISBN: 978-981-97-1277-9
eBook Packages: Computer Science, Computer Science (R0)