Abstract
As an essential object detection application, road damage detection aims to identify and mark road damage, and timely maintenance of the detected damage improves road safety. However, because road damage varies greatly in texture and shape, the proportion of the image occupied by the damaged area differs widely across samples. In addition, damaged regions are often blurred by external environmental factors, which makes accurate localization of road damage challenging. In this study, we propose a Road Damage Detector with a Local Sensing Feature Network (LSF-RDD), which uses a Local Sensing Feature Network (LSF-Net) as the neck to fuse multi-scale features extracted from the backbone and to focus on the location of the damaged area. First, the CSP-Darknet53 backbone extracts feature maps at three scales, layer by layer, from the input image. Second, these three feature maps are fed into LSF-Net for multi-scale feature fusion, producing three local feature representations. LSF-Net comprises four interconnected blocks that enable top-down and bottom-up feature fusion; feature maps from the backbone are fused across scales through the connections between the blocks. Finally, the three local feature representations are passed to the detection head for detection. Experiments show that LSF-RDD performs well on the adopted datasets, especially on the China_motorbike dataset of RDD2022, where mAP@0.5 reaches 94.4%.
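To make the pipeline described above concrete, the sketch below shows how three backbone feature maps (strides 8, 16 and 32) could be fused by one top-down and one bottom-up pass before being handed to a detection head. It is a minimal PyTorch-style illustration: the class names (LSFBlock, LSFNetSketch), channel sizes and block design are assumptions for exposition, not the authors' released LSF-Net implementation.

```python
# Minimal sketch of top-down + bottom-up multi-scale fusion over three backbone scales.
# All module names and layer choices are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LSFBlock(nn.Module):
    """One fusion block: merges a lateral feature map with a resized neighbouring scale."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.SiLU(inplace=True),
        )

    def forward(self, lateral, neighbour):
        # Resize the neighbouring scale to the lateral map's resolution, then fuse.
        neighbour = F.interpolate(neighbour, size=lateral.shape[-2:], mode="nearest")
        return self.conv(torch.cat([lateral, neighbour], dim=1))

class LSFNetSketch(nn.Module):
    """Four interconnected blocks: top-down then bottom-up fusion over P3, P4, P5."""
    def __init__(self, chs=(256, 512, 1024)):
        super().__init__()
        c3, c4, c5 = chs
        self.td4 = LSFBlock(c4 + c5, c4)   # top-down: P5 into P4
        self.td3 = LSFBlock(c3 + c4, c3)   # top-down: P4 into P3
        self.bu4 = LSFBlock(c4 + c3, c4)   # bottom-up: P3 back into P4
        self.bu5 = LSFBlock(c5 + c4, c5)   # bottom-up: P4 back into P5

    def forward(self, p3, p4, p5):
        t4 = self.td4(p4, p5)              # inject coarse context into the mid scale
        t3 = self.td3(p3, t4)              # inject it into the finest scale
        b4 = self.bu4(t4, t3)              # push fine detail back up
        b5 = self.bu5(p5, b4)
        return t3, b4, b5                  # three local feature representations for the head

# Dummy backbone outputs for a 640x640 input at strides 8, 16 and 32.
p3 = torch.randn(1, 256, 80, 80)
p4 = torch.randn(1, 512, 40, 40)
p5 = torch.randn(1, 1024, 20, 20)
print([f.shape for f in LSFNetSketch()(p3, p4, p5)])
```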
Data availability
All data, models, and code generated and utilized in this study are available upon reasonable request from the corresponding author. The code is available at https://github.com/yangwygithub/PaperCode.git, Branch: QihanHe_LSF-RDD_RoadDamageDetection2023.
References
Zou Z, Chen K, Shi Z, Guo Y, Ye J (2023) Object detection in 20 years: a survey. Proc IEEE 111(3):257–276
Dong Y, Kang C, Zhang J, Zhu Z, Wang Y, Yang X, Su H, Wei X, Zhu J (2023) Benchmarking robustness of 3d object detection to common corruptions. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1022–1032
Liu F, Wu Y, Yang X, Mo Y, Liao Y (2022) Identification of winter road friction coefficient based on multi-task distillation attention network. Pattern Anal Appl 25(2):441–449
Vareto RH, Schwartz WR (2021) Face spoofing detection via ensemble of classifiers toward low-power devices. Pattern Anal Appl 24(2):511–521
Paul SK, Bouakaz S, Rahman CM, Uddin MS (2021) Component-based face recognition using statistical pattern matching analysis. Pattern Anal Appl 24:299–319
Li G, Hao X, Zha L, Chen A (2022) An outstanding adaptive multi-feature fusion yolov3 algorithm for the small target detection in remote sensing images. Pattern Anal Appl 25(4):951–962
Li Z, He Q, Yang W (2024) E-fpn: An enhanced feature pyramid network for uav scenarios detection. The Visual Computer, 1–19
Navaneethakrishnan M, Anand MV, Vasavi G, Rani VV (2023) Deep fuzzy segnet-based lung nodule segmentation and optimized deep learning for lung cancer detection. Pattern Anal Appl 26:1143–1159
Tang X, Yu H (2023) Researches advanced in medical detection based on deep learning. In: Third International Conference on Intelligent Computing and Human Computer Interaction, pp. 651–662
Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 580–587
He K, Zhang X, Ren S, Sun J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell 37(9):1904–1916
Girshick R (2015) Fast r-cnn. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448
Ren S, He K, Girshick R, Sun J (2015) Faster r-cnn: Towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, 28
Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: Unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788
Redmon J, Farhadi A (2017) Yolo9000: Better, faster, stronger. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7263–7271
Redmon J, Farhadi A (2018) Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767
Bochkovskiy A, Wang CY, Liao HYM (2020) Yolov4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934
Jocher G (2020) Yolov5 by ultralytics https://doi.org/10.5281/zenodo.3908559
Li C, Li L, Jiang H, Weng K, Geng Y, Li L, Ke Z, Li Q, Cheng M, Nie W et al (2022) Yolov6: A single-stage object detection framework for industrial applications. arXiv preprint arXiv:2209.02976
Wang CY, Bochkovskiy A, Liao HYM (2023) Yolov7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7464–7475
Jocher G, Chaurasia A, Qiu J (2023) Yolo by ultralytics
Wang CY, Liao HYM, Wu YH, Chen PY, Yeh IH (2020) Cspnet: A new backbone that can enhance learning capability of cnn. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 390–391
Woo S, Park J, Lee JY, Kweon IS (2018) Cbam: Convolutional block attention module. In: Proceedings of the European Conference on Computer Vision, pp. 3–19
Ding X, Zhang X, Ma N, Han J, Ding G, Sun J (2021) Repvgg: Making vgg-style convnets great again. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13733–13742
Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141
Gao Z, Xie J, Wang Q, Li P (2019) Global second-order pooling convolutional networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3024–3033
Wang Q, Wu B, Zhu P, Li P, Zuo W, Hu Q (2020) Eca-net: Efficient channel attention for deep convolutional neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11534–11542
Yang Z, Zhu L, Wu Y, Yang Y (2020) Gated channel transformation for visual recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11794–11803
Mnih V, Heess N, Graves A, Kavukcuoglu K (2014) Recurrent models of visual attention. In: Advances in Neural Information Processing Systems, vol. 27
Jaderberg M, Simonyan K, Zisserman A, Kavukcuoglu K (2015) Spatial transformer networks. In: Advances in Neural Information Processing Systems, vol. 28
Hu J, Shen L, Albanie S, Sun G, Vedaldi A (2018) Gather-excite: exploiting feature context in convolutional neural networks. In: Advances in Neural Information Processing Systems, vol. 31
Wang F, Jiang M, Qian C, Yang S, Li C, Zhang H, Wang X, Tang X (2017) Residual attention network for image classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3156–3164
Park J, Woo S, Lee JY, Kweon IS (2018) Bam: bottleneck attention module. arXiv preprint arXiv:1807.06514
Yu J, Jiang Y, Wang Z, Cao Z, Huang T (2016) Unitbox: an advanced object detection network. In: Proceedings of the 24th ACM International Conference on Multimedia, pp. 516–520
Rezatofighi H, Tsoi N, Gwak J, Sadeghian A, Reid I, Savarese S (2019) Generalized intersection over union: a metric and a loss for bounding box regression. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 658–666
Zheng Z, Wang P, Liu W, Li J, Ye R, Ren D (2020) Distance-iou loss: faster and better learning for bounding box regression. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 12993–13000
Tong Z, Chen Y, Xu Z, Yu R (2023) Wise-iou: bounding box regression loss with dynamic focusing mechanism. arXiv preprint arXiv:2301.10051
Siliang M, Yong X (2023) Mpdiou: a loss for efficient and accurate bounding box regression. arXiv preprint arXiv:2307.07662
Koch C, Brilakis I (2011) Pothole detection in asphalt pavement images. Adv Eng Inform 25(3):507–515
Zou Q, Cao Y, Li Q, Mao Q, Wang S (2012) Cracktree: automatic crack detection from pavement images. Pattern Recogn Lett 33(3):227–238
Maeda H, Sekimoto Y, Seto T, Kashiyama T, Omata H (2018) Road damage detection and classification using deep neural networks with smartphone images. Comput Aided Civil Infrastruct Eng 33(12):1127–1141
Gopalakrishnan K, Khaitan SK, Choudhary A, Agrawal A (2017) Deep convolutional neural networks with transfer learning for computer vision-based data-driven pavement distress detection. Constr Build Mater 157:322–330
Shim S, Kim J, Lee SW, Cho GC (2022) Road damage detection using super-resolution and semi-supervised learning with generative adversarial network. Autom Constr 135:104139–104149
Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2020) Generative adversarial networks. Commun ACM 63(11):139–144
Roy A, Bhaduri J (2023) A computer vision enabled damage detection model with improved yolov5 based on transformer prediction head. arXiv preprint arXiv:2303.04275
Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708
Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, Lin S, Guo B (2021) Swin transformer: Hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022
Guo G, Zhang Z (2022) Road damage detection algorithm for improved yolov5. Sci Rep 12(1):15523–15533
Howard A, Sandler M, Chu G, Chen LC, Chen B, Tan M, Wang W, Zhu Y, Pang R, Vasudevan V et al (2019) Searching for mobilenetv3. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1314–1324
Hou Q, Zhou D, Feng J (2021) Coordinate attention for efficient mobile network design. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13713–13722
Arya D, Maeda H, Ghosh SK, Toshniwal D, Sekimoto Y (2022) Rdd2022: a multi-national image dataset for automatic road damage detection. arXiv preprint arXiv:2209.08538
Arya D, Maeda H, Ghosh SK, Toshniwal D, Mraz A, Kashiyama T, Sekimoto Y (2021) Deep learning-based road damage detection and classification for multiple countries. Autom Constr 132:103935–103945
Arya D, Maeda H, Ghosh SK, Toshniwal D, Sekimoto Y (2021) Rdd 2020: an annotated image dataset for automatic road damage detection using deep learning. Data brief 36:107133–107143
Arya D, Maeda H, Ghosh SK, Toshniwal D, Omata H, Kashiyama T, Sekimoto Y (2020) Global road damage detection: State-of-the-art solutions. In: 2020 IEEE International Conference on Big Data, pp. 5533–5539
Arya D, Maeda H, Ghosh SK, Toshniwal D, Omata H, Kashiyama T, Sekimoto Y (2022) Crowdsensing-based road damage detection challenge. In: 2022 IEEE International Conference on Big Data, pp. 6378–6386
Nanting (2022) Pavement disease product dataset. https://aistudio.baidu.com/datasetdetail/140177/0
Basily A (2020) Road damage. Kaggle. https://www.kaggle.com/datasets
LeeJIMIN (2021) Crack detection v2. Roboflow. https://universe.roboflow.com/lee-jimin-zo6tg/crack-detection
Sunkara R, Luo T (2022) No more strided convolutions or pooling: a new cnn building block for low-resolution images and small objects. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 443–459
Liu S, Qi L, Qin H, Shi J, Jia J (2018) Path aggregation network for instance segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8759–8768
Gevorgyan Z (2022) Siou loss: more powerful learning for bounding box regression. arXiv preprint arXiv:2205.12740
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
No potential conflict of interest was reported by the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
He, Q., Li, Z. & Yang, W. LSF-RDD: a local sensing feature network for road damage detection. Pattern Anal Applic 27, 99 (2024). https://doi.org/10.1007/s10044-024-01314-8
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s10044-024-01314-8