Abstract
Advances in computer vision have made hand gesture recognition an active research area, and detecting hand gestures efficiently remains a central challenge. Applications include human-computer interaction, sign language interpretation, immersive gaming, and robotic control systems. This paper presents an end-to-end system for hand gesture recognition based on RetinaNet, an efficient object detection architecture. A Convolutional Block Attention Module (CBAM) is used to improve performance. The method was tested on the OUHANDS dataset, where it achieved improved recognition accuracy and real-time performance. Results show that our CBAM-RetinaNet model recognizes hand gestures robustly and efficiently against complex backgrounds and under varying illumination.
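The abstract names CBAM as the attention component but does not say where it is attached inside RetinaNet. As a rough illustration only, the following NumPy sketch applies the two generic CBAM stages (channel attention, then spatial attention) to a single feature map. The tensor sizes, the reduction ratio `r`, the 7 x 7 kernel, and the random weights are all assumptions chosen for the example; they are not taken from the paper, and a trained model would learn these weights.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Channel weights from a shared two-layer MLP over pooled descriptors.

    feat: (C, H, W); w1: (C // r, C); w2: (C, C // r).
    """
    avg = feat.mean(axis=(1, 2))  # global average pooling, shape (C,)
    mx = feat.max(axis=(1, 2))    # global max pooling, shape (C,)

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # ReLU hidden layer

    return sigmoid(mlp(avg) + mlp(mx))  # shape (C,), values in (0, 1)


def spatial_attention(feat, kernel):
    """Spatial weights from a k x k filter over channel-wise avg and max maps.

    feat: (C, H, W); kernel: (2, k, k) with odd k.
    """
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    h, w = feat.shape[1:]
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            # Cross-correlation, as in typical deep learning "convolutions".
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)  # shape (H, W), values in (0, 1)


def cbam(feat, w1, w2, kernel):
    """Refine a feature map: channel attention first, then spatial attention."""
    feat = feat * channel_attention(feat, w1, w2)[:, None, None]
    return feat * spatial_attention(feat, kernel)[None, :, :]


# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
C, H, W, r = 8, 5, 5, 4
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.1
refined = cbam(feat, w1, w2, kernel)
```

Because both attention maps pass through a sigmoid, the refined feature map has the same shape as the input and each value is scaled by factors between 0 and 1. In a detector, such a block would typically sit on backbone or pyramid features; which placement the authors use is not stated in the abstract.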
Acknowledgement
We thank the Department of Biotechnology, Government of India, for financially supporting Project BT/COE/34/SP28408/2018.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Suguna, K.Y., Dutta, H.P.J., Bhuyan, M.K., Laskar, R.H. (2022). Hand Gesture Recognition Using CBAM-RetinaNet. In: Raman, B., Murala, S., Chowdhury, A., Dhall, A., Goyal, P. (eds) Computer Vision and Image Processing. CVIP 2021. Communications in Computer and Information Science, vol 1568. Springer, Cham. https://doi.org/10.1007/978-3-031-11349-9_38
DOI: https://doi.org/10.1007/978-3-031-11349-9_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-11348-2
Online ISBN: 978-3-031-11349-9
eBook Packages: Computer Science, Computer Science (R0)