
SAST: a suppressing ambiguity self-training framework for facial expression recognition

Published in: Multimedia Tools and Applications


Abstract

Facial expression recognition (FER) suffers from insufficient label information: human expressions are complex and diverse, and many are inherently ambiguous. Training with low-quality or scarce labels aggravates the ambiguity of model predictions and reduces FER accuracy, so improving the robustness of FER to ambiguous data under insufficient information remains challenging. To this end, we propose the Suppressing Ambiguity Self-Training (SAST) framework, the first attempt to address insufficient label information in both quality and quantity simultaneously. Specifically, we design an Ambiguous Relative Label Usage (ARLU) strategy that mixes hard labels and soft labels to alleviate the information loss caused by hard labels alone. We also enhance the model's robustness to ambiguous data by means of Self-Training Resampling (STR), and further use facial landmarks and a Patch Branch (PB) to strengthen ambiguity suppression. Experiments on the RAF-DB, FERPlus, SFEW, and AffectNet datasets show that SAST outperforms six semi-supervised methods while using fewer annotations, and achieves accuracy competitive with state-of-the-art (SOTA) FER methods. Our code is available at https://github.com/Liuxww/SAST.
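The ARLU strategy above mixes hard (one-hot) labels with soft (probability-distribution) labels. As an illustrative sketch only — the paper's exact per-sample formulation is not reproduced here, and the mixing weight `alpha` and the 7-class setting are assumptions for the example — a simple convex combination could look like:

```python
import numpy as np

def mix_hard_soft_labels(hard_label, soft_probs, alpha=0.6, num_classes=7):
    """Blend a one-hot hard label with a soft probability distribution.

    alpha weighs the hard label; (1 - alpha) weighs the soft distribution.
    Both inputs being valid distributions, the result also sums to 1.
    """
    one_hot = np.zeros(num_classes)
    one_hot[hard_label] = 1.0
    return alpha * one_hot + (1.0 - alpha) * np.asarray(soft_probs, dtype=float)

# Hypothetical example: annotated class 3, with a model's soft prediction
# over the 7 basic expression classes.
soft = np.array([0.05, 0.05, 0.1, 0.6, 0.1, 0.05, 0.05])
mixed = mix_hard_soft_labels(3, soft, alpha=0.6)
```

Compared with a pure one-hot target, such a mixed target retains information about secondary (ambiguous) expressions while still emphasizing the annotated class.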


Data Availability Statement

The data that support the findings of this study are openly available in public repositories:
  • RAF-DB: http://whdeng.cn/RAF/model1.html/data-set
  • SFEW: https://cs.anu.edu.au/few/emotiw2015.html
  • FERPlus: https://github.com/microsoft/FERPlus
  • AffectNet: http://mohammadmahoor.com/affectnet/


Funding

This work was supported in part by the National Natural Science Foundation of China under Grants 62071384 and 62371399, the Key Research and Development Project of Shaanxi Province under Grant 2023-YBGY-239, and the Natural Science Basic Research Plan in Shaanxi Province of China under Grant 2023-JC-YB-531.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Zhe Guo.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Guo, Z., Wei, B., Liu, X. et al. SAST: a suppressing ambiguity self-training framework for facial expression recognition. Multimed Tools Appl 83, 56059–56076 (2024). https://doi.org/10.1007/s11042-023-17749-w
