
Hand Gesture Recognition Using CBAM-RetinaNet

  • Conference paper
Computer Vision and Image Processing (CVIP 2021)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1568)


Abstract

Hand gesture recognition has become increasingly popular with advances in computer vision, and efficient detection of hand gestures is an active research topic. It has numerous real-world applications, such as human-computer interaction, sign language interpretation, immersive gaming, and robotic control systems. This paper presents an end-to-end system for hand gesture recognition built on an efficient object detection architecture, RetinaNet, with a Convolutional Block Attention Module (CBAM) added to improve performance. The method has been evaluated on the OUHANDS dataset, where it achieves improved recognition accuracy while remaining suitable for real-time use. Results show that the proposed CBAM-RetinaNet model is robust and efficient at recognizing hand gestures against complex backgrounds and under varying illumination.
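The page does not include an implementation, but the abstract's core idea (refining backbone feature maps with CBAM before detection) can be sketched compactly. Below is a minimal PyTorch sketch of a CBAM block, following the standard channel-then-spatial attention formulation of Woo et al.; the reduction ratio, kernel size, and the example of applying it to a 256-channel feature map are illustrative assumptions, not the authors' actual configuration.

# Minimal CBAM sketch (channel attention followed by spatial attention).
# Hyperparameters (reduction=16, kernel_size=7) are common CBAM defaults and
# are assumed here for illustration; they are not taken from this paper.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        # Average- and max-pool over the spatial dimensions, pass both through
        # a shared MLP, and gate each channel with the combined response.
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        return x * torch.sigmoid(avg + mx).view(b, c, 1, 1)

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        # Pool along the channel axis and learn a 2-D attention map.
        avg = x.mean(dim=1, keepdim=True)
        mx = x.amax(dim=1, keepdim=True)
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class CBAM(nn.Module):
    """Refine a feature map with channel attention, then spatial attention."""
    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        self.channel = ChannelAttention(channels, reduction)
        self.spatial = SpatialAttention(kernel_size)

    def forward(self, x):
        return self.spatial(self.channel(x))

if __name__ == "__main__":
    # Hypothetical usage: refine a 256-channel feature map of the kind a
    # RetinaNet FPN level produces, before the classification/regression heads.
    features = torch.randn(1, 256, 64, 64)
    refined = CBAM(256)(features)
    print(refined.shape)  # torch.Size([1, 256, 64, 64])

In a RetinaNet-style detector, such a block would typically be inserted after backbone or FPN feature maps so that attention-weighted features feed the detection heads; the exact insertion points used by the authors are not stated on this page.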



Acknowledgement

We acknowledge the Department of Biotechnology, Government of India, for financial support under Project BT/COE/34/SP28408/2018.

Author information


Corresponding author

Correspondence to M. K. Bhuyan.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Suguna, K.Y., Dutta, H.P.J., Bhuyan, M.K., Laskar, R.H. (2022). Hand Gesture Recognition Using CBAM-RetinaNet. In: Raman, B., Murala, S., Chowdhury, A., Dhall, A., Goyal, P. (eds) Computer Vision and Image Processing. CVIP 2021. Communications in Computer and Information Science, vol 1568. Springer, Cham. https://doi.org/10.1007/978-3-031-11349-9_38


  • DOI: https://doi.org/10.1007/978-3-031-11349-9_38


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-11348-2

  • Online ISBN: 978-3-031-11349-9

  • eBook Packages: Computer Science, Computer Science (R0)
