One-Dimensional Feature Supervision Network for Object Detection

  • Conference paper
Advanced Intelligent Computing Technology and Applications (ICIC 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14090)

Abstract

Self-attention mechanisms are widely used in object detection to distinguish the importance of different channels and to reinforce important information in features, and they have led to strong results at all scales. However, most self-attention mechanisms and their variants focus only on the channel dimension and thus easily ignore the width and height dimensions of the feature map, which play an important role in capturing local contextual information. To alleviate this problem, we propose a one-dimensional feature supervision network for object detection (1DSNet). Specifically, we first propose a one-dimensional feature supervision module (1DSM), which uses a lightweight one-dimensional feature vector to weight the features along the width and height dimensions, respectively, jointly reinforcing the important information in the features. Moreover, to improve the representation of multi-scale contextual information, we construct a receptive field dilated pyramid pooling (RFD-SPP) that builds on spatial pyramid pooling to obtain a larger field of view. Finally, experimental results demonstrate that the proposed 1DSNet is effective and competitive with several representative methods.
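
As a rough illustration of weighting a feature map along its width and height with one-dimensional vectors, the idea can be sketched as below. This is a minimal sketch under stated assumptions, not the 1DSM described in the paper: the pooling by plain averaging, the sigmoid gating, the parameter-free design, and the names `sigmoid` and `one_dim_reweight` are all hypothetical choices made only for this example.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function, mapping values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def one_dim_reweight(feat):
    """Reweight a (C, H, W) feature map with two 1-D gate vectors.

    Assumption: each gate is obtained by average pooling followed by a
    sigmoid; a real module would likely contain learned layers.
    """
    # Height descriptor: average over channels and width -> shape (H,)
    h_vec = feat.mean(axis=(0, 2))
    # Width descriptor: average over channels and height -> shape (W,)
    w_vec = feat.mean(axis=(0, 1))

    # Reshape the gates so they broadcast along the matching axis.
    h_gate = sigmoid(h_vec)[None, :, None]   # (1, H, 1)
    w_gate = sigmoid(w_vec)[None, None, :]   # (1, 1, W)

    # Apply both one-dimensional weightings jointly.
    return feat * h_gate * w_gate


rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 6))   # hypothetical C=8, H=4, W=6
out = one_dim_reweight(feat)
```

The output keeps the shape of the input, and each gate costs only H or W values rather than a full H x W attention map, which is the sense in which such a weighting is lightweight.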

Acknowledgment

This work was supported by the Natural Science Foundation of Henan under Grant 232300421023.

Author information

Corresponding author

Correspondence to Yongsheng Dong.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Shen, L., Dong, Y., Pei, Y., Yang, H., Zheng, L., Ma, J. (2023). One-Dimensional Feature Supervision Network for Object Detection. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14090. Springer, Singapore. https://doi.org/10.1007/978-981-99-4761-4_13

  • DOI: https://doi.org/10.1007/978-981-99-4761-4_13

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-4760-7

  • Online ISBN: 978-981-99-4761-4

  • eBook Packages: Computer Science, Computer Science (R0)
