One-Dimensional Feature Supervision Network for Object Detection

Shen, Longchao; Dong, Yongsheng; Pei, Yuanhua; Yang, Haotian; Zheng, Lintao; Ma, Jinwen

doi:10.1007/978-981-99-4761-4_13

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14090)

Included in the following conference series:

International Conference on Intelligent Computing

Abstract

Self-attention mechanisms have been widely used in object detection to distinguish the importance of different channels and reinforce important information in features, and they have led to strong results at all scales. However, most self-attention mechanisms, as well as their variants, focus only on the channel dimension and thus easily ignore the width and height dimensions of the feature map, which play an important role in capturing local contextual information. To alleviate this problem, we propose a one-dimensional feature supervision network for object detection (1DSNet). Specifically, we first propose a one-dimensional feature supervision module (1DSM), which uses lightweight one-dimensional feature vectors to weight the features along the width and height dimensions, respectively, jointly reinforcing the important information in the features. Moreover, to improve the representation of multi-scale contextual information, we construct a receptive field dilated pyramid pooling (RFD-SPP) that obtains a larger field of view on top of spatial pyramid pooling. Finally, experimental results demonstrate that the proposed 1DSNet is effective and competitive with several representative methods.
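The idea of weighting a feature map with one-dimensional vectors along its height and width can be illustrated with a minimal sketch. The abstract does not specify how 1DSM computes or applies its vectors, so everything below is an assumption made for illustration: mean pooling to obtain the two vectors, sigmoid gating, and the hypothetical names `sigmoid` and `one_d_reweight`. It is not the authors' implementation.

```python
import math


def sigmoid(x):
    """Logistic function, used here as an assumed gating nonlinearity."""
    return 1.0 / (1.0 + math.exp(-x))


def one_d_reweight(feat):
    """Reweight a feature map with two 1-D gate vectors (hypothetical sketch).

    feat is a nested list indexed as feat[c][i][j] with shape [C][H][W].
    Returns a new nested list of the same shape.
    """
    C, H, W = len(feat), len(feat[0]), len(feat[0][0])

    # Height descriptor: average over channels and width, giving H values.
    h_vec = [
        sum(feat[c][i][j] for c in range(C) for j in range(W)) / (C * W)
        for i in range(H)
    ]
    # Width descriptor: average over channels and height, giving W values.
    w_vec = [
        sum(feat[c][i][j] for c in range(C) for i in range(H)) / (C * H)
        for j in range(W)
    ]

    # Assumed gating: squash each descriptor into (0, 1).
    h_gate = [sigmoid(v) for v in h_vec]
    w_gate = [sigmoid(v) for v in w_vec]

    # Each position is scaled by the gate of its row and of its column.
    return [
        [[feat[c][i][j] * h_gate[i] * w_gate[j] for j in range(W)]
         for i in range(H)]
        for c in range(C)
    ]


# Toy 1 x 2 x 2 feature map: the first row is strongly activated.
feat = [[[4.0, 4.0], [0.0, 0.0]]]
out = one_d_reweight(feat)
```

In this sketch the two gate vectors hold only H + W values, which is what makes the weighting lightweight compared with a full H x W spatial attention map.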

Acknowledgment

This work was supported by the Natural Science Foundation of Henan under Grant 232300421023.

Author information

Authors and Affiliations

School of Information Engineering, Henan University of Science and Technology, Luoyang, 471023, China
Longchao Shen, Yongsheng Dong, Yuanhua Pei, Haotian Yang & Lintao Zheng
Department of Information and Computational Sciences, School of Mathematical Sciences and LMAM, Peking University, Beijing, 100871, China
Jinwen Ma

Corresponding author

Correspondence to Yongsheng Dong.

Editor information

Editors and Affiliations

Department of Computer Science, Eastern Institute of Technology, Zhejiang, China
De-Shuang Huang
University of Wollongong, North Wollongong, NSW, Australia
Prashan Premaratne
Zhengzhou University of Light Industry, Zhengzhou, China
Baohua Jin
Zhong Yuan University of Technology, Zhengzhou, China
Boyang Qu
University of Ulsan, Ulsan, Korea (Republic of)
Kang-Hyun Jo
Department of Computer Science, Liverpool John Moores University, Liverpool, UK
Abir Hussain

About this paper

Cite this paper

Shen, L., Dong, Y., Pei, Y., Yang, H., Zheng, L., Ma, J. (2023). One-Dimensional Feature Supervision Network for Object Detection. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14090. Springer, Singapore. https://doi.org/10.1007/978-981-99-4761-4_13

DOI: https://doi.org/10.1007/978-981-99-4761-4_13
Published: 31 July 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-4760-7
Online ISBN: 978-981-99-4761-4
eBook Packages: Computer Science, Computer Science (R0)
