Abstract
The surge in Internet of Things (IoT) usage has led to a rise in security breaches within the IoT ecosystem. Consequently, there is a pressing need to deploy robust Intrusion Detection Systems (IDSs) to safeguard IoT environments. This paper proposes a framework designed to establish stringent decision boundaries for effective attack detection, evaluated on two widely used datasets: CICIDS2017 and EDGE-IIOT. Both datasets exhibit imbalanced class distributions and contain numerous features with distinct characteristics. To address the class imbalance, the framework combines oversampling, using the synthetic minority oversampling technique tuned with a genetic algorithm (GA-SMOTE) or with particle swarm optimization (SMOTE-PSO), with random undersampling (RUS). The framework uses three tree-based learning algorithms, Decision Tree, Random Forest, and XGBoost, to identify cyberattacks and associated anomalies within the constrained IoT landscape. Feature selection is performed using the Boruta algorithm and the Whale Optimization Algorithm (WOA), and pruning algorithms are used to reduce the complexity of the model. The framework is evaluated using standard metrics on both workstations and Raspberry Pi boards to demonstrate its suitability for constrained IoT devices. The proposed model achieves an accuracy of 99.99% in identifying cyberattacks and related anomalies on the CICIDS2017 dataset, exceeding the performance of existing baseline models, and an accuracy of 99.5% on the EDGE-IIOT dataset. When implemented on Raspberry Pi boards, the framework also shows promising memory usage and execution time: its best configuration requires 3.07 MB of memory and 4.26 s of execution time for the CICIDS2017 dataset, and 1.93 MB of memory and 4.09 s of execution time for the EDGE-IIOT dataset.
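The class-balancing step described in the abstract can be illustrated with a minimal sketch. It assumes plain SMOTE interpolation followed by random undersampling, and it does not reproduce the GA-SMOTE or SMOTE-PSO variants used in the paper. The function names, the per-class `target` size, and the neighbour count `k` are hypothetical choices made for illustration only.

```python
import random


def random_undersample(rows, labels, target, rng):
    """Keep at most `target` randomly chosen samples per class (RUS)."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        kept = members if len(members) <= target else rng.sample(members, target)
        out_rows.extend(kept)
        out_labels.extend([label] * len(kept))
    return out_rows, out_labels


def smote_oversample(minority, n_new, k, rng):
    """Create `n_new` synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (plain SMOTE)."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # Rank the remaining minority samples by squared Euclidean distance.
        neighbours = sorted(
            others,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append([a + gap * (b - a) for a, b in zip(base, mate)])
    return synthetic


def balance(rows, labels, target, k=3, seed=0):
    """Undersample large classes, then oversample small ones, to `target`."""
    rng = random.Random(seed)
    rows, labels = random_undersample(rows, labels, target, rng)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    for label, members in by_class.items():
        deficit = target - len(members)
        # Interpolation needs at least two real samples in the class.
        if deficit > 0 and len(members) >= 2:
            rows.extend(smote_oversample(members, deficit, k, rng))
            labels.extend([label] * deficit)
    return rows, labels
```

Calling `balance(rows, labels, target=8)` on a dataset with 20 benign and 4 attack records would return 8 records of each class, with the 4 additional attack records synthesized between existing ones. In the paper's setting, the sampling parameters are tuned by a genetic algorithm or particle swarm optimization rather than fixed by hand as they are here.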
Data availability
The datasets used in the current work are publicly available.
Acknowledgements
We thank the Center for Excellence in Internet of Things at VITAP University for providing the necessary support throughout the work.
Funding
This study has not received funding from any organization.
Author information
Contributions
SM, TA wrote the main manuscript text; SM, RS and RKS supervised the research work; SM, TA and AHS analysed and interpreted the data; SM, RS, SNM and RKS contributed analysis tools and edited the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Mishra, S., Anithakumari, T., Sahay, R. et al. LIRAD: lightweight tree-based approaches on resource constrained IoT devices for attack detection. Cluster Comput 28, 140 (2025). https://doi.org/10.1007/s10586-024-04792-x
DOI: https://doi.org/10.1007/s10586-024-04792-x