Abstract
Prognostics and health management (PHM) aims to offer comprehensive solutions for managing equipment health. Classifying excavator operations plays an important role in estimating equipment lifetime, which is one of the tasks in PHM, because the effect on lifetime depends on the operations the excavator performs. Several researchers have attempted to classify these operations with either sensor or video data, but most approaches are limited by their reliance on a single modality, by the influence of the surrounding environment, and by feature extraction that is exclusive to each data domain. In this paper, we propose a fusion network that classifies excavator operations with multi-modal deep learning models. Multiple classifiers are first trained, each on a specific type of data, and their feature extractors are then reused at the front of the fusion network. The proposed fusion network combines a video-based model and a sensor-based model, both based on deep learning. To evaluate the performance of the proposed method, experiments are conducted with data collected from a real construction workplace. The proposed method yields an accuracy of 98.48%, which is higher than that of conventional methods, and the multi-modal deep learning models can complement each other in terms of precision, recall, and F1-score.
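The fusion idea described in the abstract can be illustrated with a minimal sketch: two modality-specific feature extractors feed a shared classification head. This is not the authors' architecture. The extractors below are hypothetical hand-written stand-ins for the pretrained video and sensor models, and the names `video_features`, `sensor_features`, and `fuse_and_classify`, the feature choices, and the three-class head are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def video_features(frames):
    """Hypothetical stand-in for a pretrained video feature extractor.

    Each frame is a flat list of pixel intensities; the feature is the
    mean intensity per frame.
    """
    return [sum(frame) / len(frame) for frame in frames]


def sensor_features(readings):
    """Hypothetical stand-in for a pretrained sensor feature extractor.

    Each channel is a list of samples; the features are the mean and the
    range of every channel.
    """
    feats = []
    for channel in readings:
        feats.append(sum(channel) / len(channel))
        feats.append(max(channel) - min(channel))
    return feats


def fuse_and_classify(frames, readings, weights, bias):
    """Concatenate both feature vectors and apply a shared softmax head."""
    fused = video_features(frames) + sensor_features(readings)
    return softmax(linear(weights, bias, fused))


if __name__ == "__main__":
    # Toy inputs: two frames of two pixels, one sensor channel of three samples.
    frames = [[0.1, 0.2], [0.3, 0.4]]
    readings = [[1.0, 2.0, 3.0]]
    # Assumed three-class head over the four fused features.
    weights = [[0.0, 0.0, 0.0, 0.0],
               [1.0, 0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0, 1.0]]
    bias = [0.0, 0.0, 0.0]
    print(fuse_and_classify(frames, readings, weights, bias))
```

In a trained system the extractors would be the front layers of the separately trained classifiers, and the weights of the shared head would be learned on the fused features rather than set by hand.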
Acknowledgement
This work has been supported by a grant from Doosan Infracore, Inc.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Kim, JY., Cho, SB. (2020). Classifying Excavator Operations with Fusion Network of Multi-modal Deep Learning Models. In: Martínez Álvarez, F., Troncoso Lora, A., Sáez Muñoz, J., Quintián, H., Corchado, E. (eds) 14th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2019). SOCO 2019. Advances in Intelligent Systems and Computing, vol 950. Springer, Cham. https://doi.org/10.1007/978-3-030-20055-8_3
DOI: https://doi.org/10.1007/978-3-030-20055-8_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20054-1
Online ISBN: 978-3-030-20055-8
eBook Packages: Intelligent Technologies and Robotics (R0)