Machine Learning Approaches for Anomaly Detection in Complex Systems: https://doi.org/10.5281/zenodo.19465217

Muhammad Umer Imran; Song Yiying; Syed Danyal Ali Naqvi; Marsad Rasheed; Hadin Khan; Syed Nouman Ali Shah

Authors

Muhammad Umer Imran, Department of Clinical Medicine, Nanchang University, China
Song Yiying, Department of Data Science and Big Data Specialization, Shenyang University, Shenyang, China
Syed Danyal Ali Naqvi, Department of Computer Science, COMSATS University, Islamabad, Pakistan
Marsad Rasheed, Department of Computer Science, COMSATS University, Islamabad, Pakistan
Hadin Khan, Department of Computer Science, COMSATS University, Islamabad, Pakistan
Syed Nouman Ali Shah, School of Computer and IT, Beaconhouse National University, Lahore, Punjab, Pakistan

Keywords:

Anomaly Detection; Machine Learning; Complex Systems; Autoencoder; XGBoost; AUC-ROC; Statistical Comparison

Abstract

Detecting anomalies in complex systems is challenging because of high dimensionality, nonlinear relationships, and extreme class imbalance. This study presents a comparative, quantitative assessment of classical machine learning, deep learning, and supervised techniques for anomaly detection on multivariate system data. Five widely used models were systematically evaluated: Isolation Forest, One-Class Support Vector Machine (One-Class SVM), Local Outlier Factor, an autoencoder-based neural network, and Extreme Gradient Boosting (XGBoost). Model performance was measured by accuracy, precision, recall, F1-score, area under the receiver operating characteristic curve (AUC-ROC), and false-positive and false-negative rates. The supervised XGBoost model performed best overall, achieving 97.3% accuracy, an F1-score of 0.83, and an AUC-ROC of 0.96, with the lowest false-negative rate (3.1%). Among the unsupervised methods, the autoencoder outperformed the classical methods, with 95.6% accuracy, an F1-score of 0.75, an AUC-ROC of 0.92, and nearly balanced error rates (false-positive rate: 4.8%; false-negative rate: 4.4%). Isolation Forest showed moderate performance (AUC-ROC: 0.89), while One-Class SVM and Local Outlier Factor had lower recall and higher error rates. Pairwise statistical comparisons of AUC-ROC values showed that XGBoost and the autoencoder were each significantly better than both One-Class SVM and Local Outlier Factor (p < 0.05) but were not significantly different from each other (p = 0.07). Overall, the findings quantify the benefit of supervised learning when labeled data are available and highlight the effectiveness of deep autoencoder-based methods for unsupervised anomaly detection in complex systems.
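The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names `detection_metrics` and `auc_roc`, the label convention (1 = anomaly, 0 = normal), and the toy data in the usage note are assumptions made for illustration only. AUC-ROC is computed here through its rank interpretation, that is, the probability that a randomly chosen anomaly receives a higher score than a randomly chosen normal point.

```python
from dataclasses import dataclass


@dataclass
class DetectionMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float
    false_positive_rate: float
    false_negative_rate: float


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a denominator is empty.
    return numerator / denominator if denominator else 0.0


def detection_metrics(y_true, y_pred):
    """Threshold-based metrics for binary labels (assumed: 1 = anomaly, 0 = normal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return DetectionMetrics(
        accuracy=_ratio(tp + tn, len(pairs)),
        precision=precision,
        recall=recall,
        f1=_ratio(2 * precision * recall, precision + recall),
        false_positive_rate=_ratio(fp, fp + tn),  # normal points flagged as anomalies
        false_negative_rate=_ratio(fn, fn + tp),  # anomalies that were missed
    )


def auc_roc(y_true, scores):
    """AUC-ROC via pairwise comparison of anomaly and normal scores (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one anomaly and one normal sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

On a hypothetical set of eight normal points and two anomalies, where the detector catches one anomaly, misses the other, and raises one false alarm, `detection_metrics` gives 80% accuracy but a recall and F1-score of only 0.5. This is why the study reports F1-score, AUC-ROC, and the two error rates alongside accuracy when classes are heavily imbalanced.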
