
Hand Gesture Recognition Using CBAM-RetinaNet

  • Conference paper
Computer Vision and Image Processing (CVIP 2021)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1568)


Abstract

Hand gesture recognition has become increasingly popular with advances in computer vision, and efficient detection of hand gestures is an active research topic. It has numerous real-world applications, such as human-computer interaction, sign language interpretation, immersive gaming, and robotic control systems. This paper presents an end-to-end system for hand gesture recognition built on an efficient object detection architecture, RetinaNet, with a Convolutional Block Attention Module (CBAM) added to improve performance. The method has been evaluated on the OUHANDS dataset, where it achieves improved recognition accuracy while remaining suitable for real-time use. Results show that the proposed CBAM-RetinaNet model is robust and efficient at recognizing hand gestures against complex backgrounds and under varying illumination.
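The page does not include an implementation, but the abstract's core idea (refining backbone feature maps with CBAM before detection) can be sketched compactly. Below is a minimal PyTorch sketch of a CBAM block, following the standard channel-then-spatial attention formulation of Woo et al.; the reduction ratio, kernel size, and the example of applying it to a 256-channel feature map are illustrative assumptions, not the authors' actual configuration.

# Minimal CBAM sketch (channel attention followed by spatial attention).
# Hyperparameters (reduction=16, kernel_size=7) are common CBAM defaults and
# are assumed here for illustration; they are not taken from this paper.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        # Average- and max-pool over the spatial dimensions, pass both through
        # a shared MLP, and gate each channel with the combined response.
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        return x * torch.sigmoid(avg + mx).view(b, c, 1, 1)

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        # Pool along the channel axis and learn a 2-D attention map.
        avg = x.mean(dim=1, keepdim=True)
        mx = x.amax(dim=1, keepdim=True)
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class CBAM(nn.Module):
    """Refine a feature map with channel attention, then spatial attention."""
    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        self.channel = ChannelAttention(channels, reduction)
        self.spatial = SpatialAttention(kernel_size)

    def forward(self, x):
        return self.spatial(self.channel(x))

if __name__ == "__main__":
    # Hypothetical usage: refine a 256-channel feature map of the kind a
    # RetinaNet FPN level produces, before the classification/regression heads.
    features = torch.randn(1, 256, 64, 64)
    refined = CBAM(256)(features)
    print(refined.shape)  # torch.Size([1, 256, 64, 64])

In a RetinaNet-style detector, such a block would typically be inserted after backbone or FPN feature maps so that attention-weighted features feed the detection heads; the exact insertion points used by the authors are not stated on this page.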



Acknowledgement

We acknowledge the Department of Biotechnology, Government of India, for financial support under Project BT/COE/34/SP28408/2018.

Author information


Corresponding author

Correspondence to M. K. Bhuyan.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Suguna, K.Y., Dutta, H.P.J., Bhuyan, M.K., Laskar, R.H. (2022). Hand Gesture Recognition Using CBAM-RetinaNet. In: Raman, B., Murala, S., Chowdhury, A., Dhall, A., Goyal, P. (eds) Computer Vision and Image Processing. CVIP 2021. Communications in Computer and Information Science, vol 1568. Springer, Cham. https://doi.org/10.1007/978-3-031-11349-9_38


  • DOI: https://doi.org/10.1007/978-3-031-11349-9_38


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-11348-2

  • Online ISBN: 978-3-031-11349-9

  • eBook Packages: Computer Science, Computer Science (R0)
