Abstract
Industrial data contain a large amount of noise that deep learning models cannot suppress well. Existing industrial data classification models suffer from incomplete features, inadequate self-adaptability, insufficient approximation capability of the classifier, and weak robustness. To address these issues, this paper proposes an intelligent classification method based on self-attention learning features and stochastic configuration networks (SCNs), which imitates the feedback regulation of human cognition to achieve ensemble learning. First, at the feature extraction stage, a fused deep neural network model based on self-attention is constructed: a self-attention long short-term memory (LSTM) network and a self-attention residual network with adaptive hierarchies extract, after noise suppression, the global temporal features and local spatial features of faults in the industrial time-series dataset, respectively. Second, at the classifier design stage, the fused complete feature vectors are fed to SCNs, which have universal approximation capability, to establish general classification criteria. Then, based on generalized error and entropy theory, performance indexes are established for real-time evaluation of the credibility of uncertain classification results, and an adaptive adjustment mechanism for the network hierarchy of the self-attention fusion networks is built to realize self-optimization of the multi-hierarchy complete features and their classification criteria. Finally, a fuzzy integral is used to integrate the classification results of self-attention fusion network models with different hierarchies, improving the robustness of the classification model. Compared with other classification models, the proposed model performs better on a rolling bearing fault dataset.
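For intuition, the following is a minimal PyTorch sketch of the kind of self-attention gating described above for the LSTM branch: a global contextual feature is pooled from the hidden states, squeezed through a sigmoid to form a threshold, and the hidden states are soft-thresholded to suppress noisy components before being pooled into a fixed-length feature vector. The module name, layer sizes, pooling choices, and the exact thresholding rule are illustrative assumptions rather than the paper's implementation; the residual branch applies an analogous channel-wise gating to its convolutional feature maps.

```python
# A minimal sketch of self-attention gating for noise suppression on the LSTM
# branch. All names and the thresholding rule are illustrative assumptions.
import torch
import torch.nn as nn


class SelfAttentionLSTMBranch(nn.Module):
    """Global temporal features with attention-based noise suppression (sketch)."""

    def __init__(self, in_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden_dim, batch_first=True)
        self.score = nn.Linear(hidden_dim, 1)             # attention scores over time
        self.squeeze = nn.Linear(hidden_dim, hidden_dim)  # produces the sigmoid gate

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, in_dim) -> h: (batch, time, hidden_dim)
        h, _ = self.lstm(x)
        # global contextual feature: attention-weighted pooling over time
        alpha = torch.softmax(self.score(h), dim=1)
        context = (alpha * h).sum(dim=1)                  # (batch, hidden_dim)
        # sigmoid gate turns the context into a per-channel threshold (assumed form)
        gate = torch.sigmoid(self.squeeze(context))
        tau = (gate * context.abs()).unsqueeze(1)         # (batch, 1, hidden_dim)
        # soft-threshold the hidden states, then pool to a fixed-length feature
        h_filter = torch.sign(h) * torch.relu(h.abs() - tau)
        return h_filter.mean(dim=1)                       # (batch, hidden_dim)


if __name__ == "__main__":
    branch = SelfAttentionLSTMBranch(in_dim=1)
    feats = branch(torch.randn(8, 128, 1))                # 8 windows of length 128
    print(feats.shape)                                    # torch.Size([8, 64])
```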
Data availability
The datasets analysed during the current study are available in the following public domain resource: [https://github.com/yyxyz/CaseWesternReserveUniversityData]
Abbreviations
- \({\varvec{x}}_t\): Input of the time-series sample at moment \(t\)
- \({\varvec{c}}_t\): Cell memory state at moment \(t\) in the LSTM cell structure
- \({\varvec{h}}_t\): Hidden state output at moment \(t\) in the LSTM cell structure
- \({\varvec{i}}_t\): Input gate at moment \(t\) in the LSTM cell structure
- \({\varvec{f}}_t\): Forget gate at moment \(t\) in the LSTM cell structure
- \({\varvec{o}}_t\): Output gate at moment \(t\) in the LSTM cell structure
- \(\tilde{{\varvec{c}}}_t\): Intermediate candidate vector of the tanh layer in the LSTM cell structure
- \({\varvec{x}}^{q=1}\): Time-series dataset input to the self-attention LSTM network (\(q=1\))
- \({\varvec{h}}^{q=1}\): Hidden layer output of the self-attention LSTM network (\(q=1\))
- \(\varvec{\varphi }_1^{q=1}\): Global contextual feature of the self-attention LSTM network (\(q=1\))
- \(\varvec{\varphi }_2^{q=1}\): Intermediate feature produced by the sigmoid function of the self-attention LSTM network (\(q=1\))
- \(\varvec{\tau }^{q=1}\): Time-scale threshold for \({\varvec{h}}^{q=1}\) of the self-attention LSTM network (\(q=1\))
- \({\varvec{h}}_{\text{filter}}^{q=1}\): Filtered feature vector output for \({\varvec{h}}^{q=1}\)
- \({\varvec{H}}^{q=1}\): Output of the self-attention LSTM network (\(q=1\))
- \({\varvec{f}}^{q=1}\): Feature map output after two convolutions of the self-attention residual network (\(q=1\))
- \({\varvec{f}}_1^{q=1}\): Global contextual feature of the self-attention residual network (\(q=1\))
- \({\varvec{f}}_2^{q=1}\): Intermediate feature produced by the sigmoid function of the self-attention residual network (\(q=1\))
- \(\varvec{\varepsilon }^{q=1}\): Channel threshold for \({\varvec{f}}^{q=1}\) of the self-attention residual network (\(q=1\))
- \({\varvec{f}}_{\text{filter}}^{q=1}\): Filtered feature vector output for \({\varvec{f}}^{q=1}\)
- \({\varvec{Y}}^{q=1}\): Output of the self-attention residual network (\(q=1\))
- \(\varvec{\lambda }_{L-1}\): Output of the \((L-1){\text{th}}\) hidden node in SCNs
- \({\varvec{Z}}\): Fused feature vector of the self-attention fusion deep network for \(N\) samples
- \(k\): Dimension of the fused feature vector \({\varvec{Z}}_j,\ j\in [1,N]\)
- \(L_{\text{max}}\): Maximum number of hidden nodes of SCNs
- \({\varvec{w}}_j\): Input weight of the \(j{\text{th}}\) hidden node of SCNs
- \({\varvec{b}}_j\): Bias of the \(j{\text{th}}\) hidden node of SCNs
- \(p\): Number of fault categories
- \(\varvec{\beta }_j\): Output weight matrix of the \(j{\text{th}}\) hidden node of SCNs
- \(g_j(\cdot )\): Activation function of the \(j{\text{th}}\) hidden node of SCNs
- \({\varvec{e}}_{L-1}({\varvec{Z}})\): Residual error of SCNs with \(L-1\) hidden nodes for \({\varvec{Z}}\)
- \({\varvec{g}}_L({\varvec{Z}})\): Activation output of the \(L{\text{th}}\) hidden node of SCNs for \({\varvec{Z}}\)
- \({\varvec{G}}_L\): Hidden-layer output matrix of SCNs with \(L\) hidden nodes
- \(\varvec{\xi }_{L,a}\): Inequality constraint variables for the hidden parameters of SCNs
- \(\varvec{\beta }^*\): Output weight matrix for \(L\) hidden nodes obtained by the least-squares method (see the SCN sketch after this list)
- \({\varvec{G}}_L^{\dag }\): Moore–Penrose generalized inverse of the matrix \({\varvec{G}}_L\)
- \(U\): Training time-series dataset of the rolling bearing
- \(M_q\): Self-attention fusion network model with hierarchy \(q\)
- \({\varvec{Z}}_j^i\): Fusion feature of the \(j{\text{th}}\) sample via \(M_q\)
- \(\tilde{{\varvec{Z}}}_j^i\): Fusion latent semantic feature for \({\varvec{Z}}_j^i\)
- \(X\): Any sample in \(U\)
- \(\tilde{{\varvec{C}}}\): Fusion latent semantic feature of \(X\)
- \(E_i\): Fusion latent semantic error entropy of \(X\) and \(U_i\)
- \({\varvec{S}}\): Covariance matrix of \(\left[ \tilde{{\varvec{C}}}; \tilde{{\varvec{Z}}}^{i}\right] ^{\mathrm {T}}\)
- \(E\): Fusion latent semantic error entropy of \(X\) and \(U\)
- \(m\): Feedback number
- \(q_0\): Initial network hierarchy of the self-attention fusion deep network
- \(q_{\text{max}}\): Maximum network hierarchy in the adaptive adjustment
- thres: Error threshold of SCNs
- \(\mu _{\text{max}}\): Maximum number of iterations of network training
- num: Number of samples in \(U\)
- \(\gamma \): Sample credibility threshold
- \(V\): Trusted sample dataset
- \(A\): Fusion network model set
- \(T\): Intermediate training dataset
- \(v\): Fuzzy measure of fusion deep network model \(A_i\)
- \(X'\): Testing time-series dataset of the rolling bearing
- \(\sigma \): Fuzzy integral of a testing sample
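To illustrate how the SCN symbols above fit together, here is a minimal NumPy sketch of incremental SCN construction on the fused feature matrix \({\varvec{Z}}\): candidate hidden nodes \(({\varvec{w}}_j, {\varvec{b}}_j)\) are generated at random, the candidate that best satisfies the supervisory inequality constraint \(\varvec{\xi }_{L,a}\ge 0\) is kept, and the output weights \(\varvec{\beta }^*\) are recomputed by least squares using \({\varvec{G}}_L^{\dag }\). This is a simplified sketch of the standard SCN construction algorithm (fixed random scope, no adaptive contraction sequence) under assumed hyper-parameters, not the authors' implementation.

```python
# Minimal NumPy sketch of SCN construction: randomly configured hidden nodes are
# accepted only when they satisfy a supervisory inequality constraint, and the
# output weights beta* are solved by least squares via the Moore-Penrose
# pseudo-inverse. Hyper-parameters (scope, pool size, r) are illustrative assumptions.
import numpy as np


def build_scn(Z, targets, L_max=50, thres=1e-2, pool=30, scope=1.0, r=0.999):
    """Z: (N, k) fused feature matrix; targets: (N, p) one-hot fault labels."""
    rng = np.random.default_rng(0)
    N, k = Z.shape
    W, b = [], []                     # accepted input weights w_j and biases b_j
    e = targets.copy()                # residual error e_{L-1}(Z), starts at the targets
    G = np.empty((N, 0))              # hidden-layer output matrix G_L
    beta = np.zeros((0, targets.shape[1]))

    for L in range(1, L_max + 1):
        best_xi, best = -np.inf, None
        for _ in range(pool):         # pool of randomly configured candidate nodes
            w = rng.uniform(-scope, scope, size=k)
            bj = rng.uniform(-scope, scope)
            g = np.tanh(Z @ w + bj)   # g_L(Z): candidate activation output
            # supervisory inequality constraint xi_{L,a}, summed over the p outputs
            xi = sum((e[:, a] @ g) ** 2 / (g @ g) - (1 - r) * (e[:, a] @ e[:, a])
                     for a in range(targets.shape[1]))
            if xi > best_xi:
                best_xi, best = xi, (w, bj, g)
        if best_xi < 0:               # no admissible candidate: stop adding nodes
            break
        w, bj, g = best
        W.append(w)
        b.append(bj)
        G = np.column_stack([G, g])
        beta = np.linalg.pinv(G) @ targets    # beta* via the pseudo-inverse of G_L
        e = targets - G @ beta                # updated residual error
        if np.sqrt((e ** 2).mean()) < thres:  # stop once the error threshold is met
            break
    return np.array(W), np.array(b), beta


if __name__ == "__main__":
    Z = np.random.randn(200, 8)                   # stand-in for fused features Z
    labels = np.random.randint(0, 3, size=200)    # p = 3 fault categories
    W, b, beta = build_scn(Z, np.eye(3)[labels])
    pred = np.argmax(np.tanh(Z @ W.T + b) @ beta, axis=1)
    print("training accuracy:", (pred == labels).mean())
```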
Acknowledgements
This work was supported in part by grants from the National Natural Science Foundation of China (62173120, 52077049, 51877060), the National Key R&D Program of China (Grant No. 2018AAA0100304), the Anhui Provincial Natural Science Foundation (2008085UD04, 2108085UD07, 2108085UD11), and the 111 Project (No. BP0719039).
Ethics declarations
Conflict of interest statement
The authors declare that they have no conflict of interest related to this work, and no commercial or associative interest that represents a conflict of interest in connection with the submitted work.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, W., Deng, Y., Ding, M. et al. Industrial data classification using stochastic configuration networks with self-attention learning features. Neural Comput & Applic 34, 22047–22069 (2022). https://doi.org/10.1007/s00521-022-07657-9
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00521-022-07657-9