Abstract
Facial expression recognition (FER) suffers from insufficient label information: human expressions are complex and diverse, and many expressions are ambiguous. Using labels of low quality or low quantity aggravates the ambiguity of model predictions and reduces FER accuracy. Improving the robustness of FER to ambiguous data with insufficient information remains challenging. To this end, we propose the Suppressing Ambiguity Self-Training (SAST) framework, the first attempt to address the problem of insufficient information in both label quality and label quantity simultaneously. Specifically, we design an Ambiguous Relative Label Usage (ARLU) strategy that mixes hard labels and soft labels to alleviate the information loss caused by hard labels. We also enhance the robustness of the model to ambiguous data by means of Self-Training Resampling (STR). We further use facial landmarks and a Patch Branch (PB) to strengthen the ability to suppress ambiguity. Experiments on the RAF-DB, FERPlus, SFEW, and AffectNet datasets show that SAST outperforms six semi-supervised methods with fewer annotations and achieves accuracy competitive with state-of-the-art (SOTA) FER methods. Our code is available at https://github.com/Liuxww/SAST.
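The core idea of mixing hard and soft labels can be sketched as a convex combination of a one-hot annotation and a predicted class distribution. This is a minimal illustration, not the paper's exact ARLU formulation; the blending weight `alpha` and the seven-class example distribution are hypothetical.

```python
import numpy as np

def mix_labels(hard_label, soft_dist, alpha=0.7):
    """Blend a one-hot hard label with a predicted soft distribution.

    hard_label: integer class index from the annotation.
    soft_dist:  model-predicted probability vector over the classes.
    alpha:      hypothetical weight on the hard label (1.0 = hard only).
    """
    soft_dist = np.asarray(soft_dist, dtype=float)
    one_hot = np.zeros_like(soft_dist)
    one_hot[hard_label] = 1.0
    mixed = alpha * one_hot + (1.0 - alpha) * soft_dist
    return mixed / mixed.sum()  # renormalize to a valid distribution

# Example: an ambiguous face annotated "happy" (index 3) whose prediction
# also places mass on "surprise" (index 5).
soft = [0.02, 0.03, 0.05, 0.55, 0.05, 0.25, 0.05]
mixed = mix_labels(3, soft, alpha=0.7)
```

Training against `mixed` (e.g., with a cross-entropy loss over distributions) keeps the annotated class dominant while preserving the ambiguity information that a pure hard label would discard.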







Data Availability Statement
The data that support the findings of this study are openly available in public repositories:
• RAF-DB: http://whdeng.cn/RAF/model1.html/data-set
• SFEW: https://cs.anu.edu.au/few/emotiw2015.html
• FERPlus: https://github.com/microsoft/FERPlus
• AffectNet: http://mohammadmahoor.com/affectnet/
Funding
This work was supported in part by the National Natural Science Foundation of China under Grant 62071384 and 62371399, the Key Research and Development Project of Shaanxi Province under Grant 2023-YBGY-239, and Natural Science Basic Research Plan in Shaanxi Province of China under Grant 2023-JC-YB-531.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Guo, Z., Wei, B., Liu, X. et al. SAST: a suppressing ambiguity self-training framework for facial expression recognition. Multimed Tools Appl 83, 56059–56076 (2024). https://doi.org/10.1007/s11042-023-17749-w