Anomaly Detection Techniques for Securing IoT Endpoints: A Machine Learning Approach

Premier Science > Anomaly Detection Techniques for Securing IoT Endpoints: A Machine Learning Approach

Kamran Asgarov
Department of Engineering Mathematics and Artificial Intelligence, Azerbaijan Technical University, Baku, Azerbaijan
Correspondence to: Kamran Asgarov, kasgarov88@gmail.com

DOI: https://doi.org/10.70389/PJS.100175

Additional information

Ethical approval: N/a
Consent: N/a
Funding: No industry funding
Conflicts of interest: N/a
Author contribution: Kamran Asgarov – Conceptualization, Writing – original draft, review and editing
Guarantor: Kamran Asgarov
Provenance and peer-review: Unsolicited and externally peer-reviewed
Data availability statement: The data that support the findings of this study are available on request from the corresponding author.

Keywords: Adaptive retraining mechanisms, Cross-validation procedure, False positive rate, Prediction latency, ROC curve, Resource consumption, Scalability test, Telemetry stream.

Peer Review
Received: 3 September 2025
Last revised: 30 October 2025
Accepted: 30 October 2025
Version accepted: 5
Published: 20 November 2025

Plain Language Summary Infographic

“Boxed infographic summarising an IoT anomaly detection study using the TON_IoT dataset and a hybrid CNN–RNN model, showing 92.1% detection accuracy, 3.7% false positives, and low latency, with sections on background, dataset, model architecture, and results.”

Abstract

Background: The primary goal of the research was to determine whether high detection accuracy could be achieved without significant computational overhead, using only publicly available data and modest simulation infrastructure.

Materials and Methods: The research was conducted using the open-access TON_IoT dataset, which contains over one million time-stamped records collected from various types of IoT devices. The study applied a hybrid machine learning architecture combining convolutional and recurrent layers with gating mechanisms. A five-fold cross-validation procedure was used to ensure the statistical reliability of the obtained results.

Results: This article presents the results of a study aimed at evaluating the effectiveness of machine learning methods for detecting anomalies in Internet of Things (IoT) telemetry streams under realistic conditions. The anomaly detection model demonstrated an average detection rate of 92.1%, a false positive rate of 3.7%, and a mean F1-score of 92.4%. In 85% of test cases, the detection latency remained below 1 s. The use of lightweight pre-processing, statistical filters, and synthetic data augmentation helped ensure robustness under conditions of class imbalance. The model retained high performance when applied to unseen data from different simulated regions, with only minimal fine-tuning required to restore accuracy.

Conclusion: The practical significance of the findings lies in the demonstrated feasibility of deploying anomaly detection systems for industrial IoT environments using only publicly available datasets and limited computational resources. The proposed approach can be implemented in monitoring platforms for early threat detection, predictive maintenance, and autonomous security control, particularly in sectors lacking access to high-end computing infrastructure or proprietary data.

Highlights

Combining convolutional layers, bidirectional long-short-term memory units and gated-linear mechanisms – successfully detected anomalies in industrial IoT telemetry.
Hybrid machine learning architecture integrating convolutional, recurrent, and gating components effectively addressed the core challenges of anomaly detection.
Fine-tuning sessions restored baseline detection rates, reinforcing the practical applicability of the architecture in dynamically changing environments.
The reproducibility of the full pipeline ensured the methodological transparency of the results and supported their future refinement.

Introduction

The rapid expansion of Internet of Things (IoT) technologies in industrial settings significantly increased the complexity of cybersecurity threats due to the distributed nature of endpoints, limited computational resources at the edge, and the rising number of potential attack surfaces.^1–3 Traditional rule-based systems and threshold detectors proved insufficient under variable production loads, network instability, and inconsistent device behaviour.^3–6 As the scale of IoT deployment grew, the need emerged for adaptive anomaly detection systems capable of identifying deviations in real time without imposing excessive computational overhead. Undetected anomalies in industrial IoT can lead to costly downtime, safety hazards, and significant financial losses, as compromised systems may disrupt production processes, damage critical infrastructure, or expose sensitive data to cyber threats.

Previous research confirmed the effectiveness of machine learning methods for anomaly detection in digital infrastructures. Ferrag et al.⁷ emphasized the advantages of deep neural networks, including autoencoders and recurrent architectures, for uncovering hidden threats in complex cybersecurity scenarios. However, the study was constrained by limited datasets or simulated environments, which prevented the validation of proposed models under realistic multi-node deployments. This raised concerns regarding the applicability and generalizability of their results. Efforts to balance detection latency with resource efficiency remained central to ongoing research. Konatham et al.⁸ introduced a hybrid architecture for industrial IoT edge devices that achieved over 95% detection accuracy but required further validation under geographically distributed and multi-device conditions. A similar limitation was noted in the work of Chen et al.⁹, where a combined XGBoost-LSTM model yielded promising results but was tested within a single operational region, restricting its extrapolation to other environments.

One of the key unresolved issues involved optimizing the trade-off between detection accuracy, response latency, and resource utilization on constrained hardware. This challenge was partially addressed by the creation of the open-access TON_IoT dataset by Moustafa et al.10, which integrated telemetry, network traffic, and operating system data to support testing in industrial IoT scenarios. The present study adopted this dataset as the foundation for experimentation.

Emeç and Ozcanhan⁵ proposed a hybrid deep learning framework combining convolutional and recurrent neural networks, demonstrating that spatial and temporal modelling of time-series data could enhance detection accuracy across public IoT datasets. Abusitta et al.¹¹ showed that transformer-based models preserved generalization across diverse device types and operating conditions. The importance of addressing rare and underrepresented anomalies was highlighted by Ullah and Mahmoud,¹² who proposed the use of conditional generative models to simulate rare attack events and improve the robustness of detection systems. Otoum et al.¹³ emphasized the importance of evaluating detection models on real or open-access data rather than solely in simulated settings, demonstrating that performance in real deployments consistently outperformed lab-based validation.

Despite these advancements, it remained uncertain whether a hybrid machine learning model could simultaneously achieve high detection accuracy, low response latency, and minimal resource usage using only public data and standard machine learning tools. For this reason, it was deemed necessary to develop and evaluate a new hybrid architecture designed to balance classification quality with deployment practicality. The model was built entirely on the TON_IoT dataset, validated through cross-validation, and tested for its ability to generalize without retraining across novel input distributions. This approach ensured complete reproducibility and removed the dependency on private datasets or high-cost industrial infrastructure. Furthermore, the use of a standardized open-source benchmark facilitated the objective comparison of results with existing anomaly detection frameworks. The research aimed to provide a scalable and transparent solution, ensuring compatibility with open-source datasets and standard infrastructure, and enabling adoption in environments with limited access to commercial monitoring hardware or proprietary telemetry.

Industrial IoT (IIoT) systems became the focus of this research due to their critical role in modern manufacturing, energy, transportation, and other industrial sectors, where downtime, system failures, or security breaches can result in significant financial losses, safety hazards, or operational disruptions. Unlike medical or domestic IoT systems, which typically involve lower stakes in terms of operational impact, IIoT systems are highly integrated into complex, real-time processes that require continuous monitoring and immediate detection of anomalies.

Materials and Methods

The study was conducted using the open-source TON_IoT dataset, developed specifically to support research on anomaly detection in industrial IoT environments. This dataset, as described by Moustafa et al.10 contains telemetry streams, network traffic, and operating system data collected from simulated industrial devices under realistic operational conditions, making it highly relevant for real-world industrial IoT environments. The dataset provides a rich multimodal structure, with over 1 million labelled time-stamped records reflecting power usage, temperature fluctuations, vibration signals, communication anomalies, and system-level behaviours. Its diverse set of features, including both operational and network metrics, allows for a comprehensive analysis of IoT system anomalies, closely mirroring the complexities found in industrial settings, such as variable load conditions, diverse device behaviours, and network instabilities. This relevance to real-world environments, coupled with its open access, makes the TON_IoT dataset an ideal foundation for evaluating anomaly detection models, as it provides both the necessary diversity in data and realistic operational scenarios for testing model performance across various industrial conditions.

To prepare the data for analysis, a structured preprocessing pipeline was applied. The raw telemetry was segmented into overlapping 60 s time windows, and from each window, 28 aggregated features were derived, including statistical moments (mean, variance, skewness), entropy, vibration intensity, power metrics, and packet delay indicators. A three-level modified Hampel filter was used to eliminate duplicate or corrupted signals, following the method proposed by Otoum et al.¹⁴, with the threshold dynamically adjusted to local signal variance. This approach reduced the missing-value ratio to 0.07% and removed over 98% of redundant messages while preserving short-term bursts and operational anomalies.

All features were normalised using z-score standardisation to harmonise scale differences between telemetry attributes. Given the scarcity of labelled anomalous samples, a lightweight synthetic augmentation strategy was implemented during training only, using the data-driven perturbation of minority-class windows to improve model robustness without distorting the validation distribution. The augmentation approach was inspired by the simulation of rare attack instances proposed by Ullah and Mahmoud.¹³ A stratified five-fold cross-validation protocol was adopted to assess performance reliably and prevent temporal or device-based data leakage. Each fold contained approximately 240,000 records, preserving the original class imbalance, with anomalies constituting less than 2.5% of the data. Stratification was performed by hashing session identifiers, ensuring that similar time windows did not appear in both training and validation sets. This design allowed consistent benchmarking across folds and reduced overfitting to isolated event sequences.

To develop the detection model, a hybrid architecture integrating convolutional neural networks (CNN), long short-term memory (LSTM) units, and gated linear units (GLU) was used, following the structural principles described by Emeç and Ozcanhan¹¹, and Abusitta et al.¹² The input was formatted as multivariate time-sequence matrices derived from the pre-processed telemetry windows. Bayesian optimisation was employed to select the optimal values for learning rate, kernel size, and dropout rate. Early stopping was used to terminate training when no further reduction in validation loss was observed.

Model performance was evaluated using multiple metrics, including detection rate, false positive rate, precision, recall, F1-score, and the area under the receiver operating characteristic curve (ROC AUC). Ground truth labels were taken directly from the TON_IoT dataset and supplemented with event metadata provided by the original authors. The F1-score was calculated as the harmonic mean of precision and recall, while ROC AUC served as a threshold-independent metric of classifier separability. To verify the statistical significance of the observed improvements, a paired two-tailed Student’s t-test was used on fold-level F1-scores, in accordance with the approach taken by Konatham et al.⁸ The effect size was additionally quantified using Cohen’s d. To evaluate the model’s performance across different devices and regions, the stratified metrics were calculated per device and region. Table 1 presents the performance metrics for each device and region, showing the detection rate, false positive rate, precision, recall, F1-score, and ROC AUC for each fold.

Table 1: Evaluation of per-device/region stratified metrics.
Device/Region	Det. Rate (%)	FPR (%)	Precision (%)	Recall (%)	F1 (%)	ROC AUC
Device A	92.5	3.3	93.0	92.4	92.7	0.96
Device B	91.8	3.7	92.3	91.6	92.0	0.95
Device C	92.3	3.4	92.7	92.2	92.5	0.96
Device D	91.2	4.1	91.8	91.0	91.4	0.94
Device E	92.8	3.2	93.1	92.7	92.9	0.97
Region 1 – North	92.2	3.5	92.6	91.9	92.3	0.97
Region 2 – South	91.9	3.4	92.4	92.1	92.3	0.96
Region 3 – East	92.6	3.3	92.9	92.5	92.7	0.97
Region 4 – West	91.5	3.8	92.1	91.3	91.7	0.95
Cross-Region A+B®C	89.8	4.2	90.3	89.6	89.9	0.94
Cross-Region A+C®B	90.7	4.0	91.1	90.5	90.8	0.95
Cross-Region B+C®A	90.1	4.1	90.6	89.9	90.2	0.94

Inference latency and computational overhead were monitored through internal runtime logging. Because the experiment was performed in a simulated runtime environment without access to industrial-grade edge hardware, system efficiency was estimated based on processing times and memory allocation patterns. The scalability of the model was examined by replaying high-frequency telemetry through the inference pipeline with increasing batch sizes, allowing evaluation of the model’s ability to process concurrent input without degradation. Generalisation across different device profiles was tested using a leave-one-sensor-type-out protocol, in which the model was trained on subsets excluding specific sensor configurations and evaluated on those excluded streams. Short fine-tuning sessions of 200 windows restored baseline accuracy, confirming model adaptability under feature variability. This technique was based on the architectural reusability paradigm discussed in the work of Chen et al.⁹

The optimisation process in this study employed Bayesian optimisation to fine-tune key hyperparameters, including the learning rate, kernel size, and dropout rate. The Adam optimiser, known for its adaptive learning rate capabilities, was used to ensure efficient training and faster convergence. Through Bayesian optimisation, the optimal values for these hyperparameters were determined, with the learning rate set at 0.001, the kernel size at 3×3 for convolutional layers, and the dropout rate at 0.3. These values were selected to balance model performance, training speed, and overfitting prevention, ensuring the model achieved high detection accuracy while maintaining efficiency in resource-constrained environments.

In order to prevent temporal and device-based leakage during the training and evaluation process, a robust session hashing strategy was employed. The primary objective of this approach was to ensure that similar time windows from the same session or device were not included in both the training and validation sets, thus eliminating the possibility of temporal and device leakage. Specifically, the dataset was partitioned based on session identifiers, which were hashed to produce distinct training and validation sets, maintaining the integrity of the time-dependent and device-specific data distributions. Additionally, overlapping windows were utilized to capture transient anomalies, with a 60 s window that overlapped by 30 s. This approach ensures that the model is exposed to short-term bursts and transitions between normal and anomalous states while avoiding artificially creating a bias due to temporal proximity.

To further assess model performance and improve generalizability, the study also reports stratified metrics per device and region, providing a deeper insight into how the model performs across different devices and geographical locations. This stratification allows for the evaluation of model behaviour on a per-device or per-region basis, ensuring that the results are not overly influenced by any one device type or region-specific data distributions. To mitigate potential leakage due to device or regional factors, a strict temporal hold-out approach was also implemented. In this approach, entire time periods from certain devices or regions were excluded from the training phase and only used for testing, ensuring that the model’s ability to generalize across unseen temporal and device distributions was thoroughly evaluated. This strategy not only safeguards against overfitting but also enhances the model’s ability to perform in real-world deployment scenarios where new devices or regions are continuously added.

The full experimental setup emphasised reproducibility, transparency, and minimal reliance on proprietary hardware or institutional access. All data, processing scripts, and trained model configurations were derived exclusively from publicly available sources, ensuring that the proposed method can be replicated and extended in diverse research settings.

Results

The empirical phase produced a coherent body of evidence that met every objective in the Introduction and directly tested the working hypothesis that a lightweight hybrid architecture could reliably detect industrial IoT anomalies using only the publicly available TON_IoT corpus and affordable workstation-class hardware. The original dataset of 1,078,860 time-stamped telemetry recordings from numerous simulated industrial devices was deduplicated, re-indexed, and partitioned into 60 s periods. As specified in the pre-processing pipeline, 28 aggregated features were extracted from each window, including statistical moments (mean, variance, skewness, and kurtosis), Shannon entropy, vibration intensity, instantaneous and integrated power flux, packet-delay indicators, and frequency-domain vibration descriptors. Using a modified three-stage Hampel filter, incomplete or corrupted messages were reduced to 0.07% without suppressing diagnostic short-bursts anomalies. Principal-component inspection showed that 12 orthogonal components accounted for 91% of global variance, confirming that vibration amplitude, bursts packet-loss rate, and instantaneous power flux were the most predictive and supporting earlier feature-importance expectations.

Stratified splitting, implemented by hashing session identifiers, preserved the original ≈2.5% anomaly ratio in every fold and thus maintained field-realistic class imbalance while preventing temporal leakage. Because anomalous windows remained scarce despite stratification, a synthetic minority augmentation strategy was applied only to the training partitions: data-driven perturbations of minority-class windows created plausible but unseen variations in each batch, expanding coverage of the minority manifold without contaminating validation distributions. Explicitly restricting augmentation to the training phase guaranteed that the metrics reported below remained unbiased and fully comparable to future work on the same corpus.

A convolutional front end, bidirectional LSTM layer, gated-linear-unit filter, and Bayesian-optimised hyperparameter values comprised the hybrid model. Early stopping stopped training once, validation loss did not diminish for 12 epochs, and an adaptive learning-rate schedule lowered step size when the plateau lasted more than 4 epochs. No external clusters, edge gateways, or proprietary accelerators were needed for training or inference on a single consumer workstation with an eight-core 3.5 GHz CPU, 32 GB of RAM, and a single RTX 3060 GPU, reflecting the project’s initial deployment scenario. In a five-fold cross-validation technique, the detector achieved the metrics in Table 2, which reports only on hold-out windows and indicates generalisation performance rather than in-fold optimisation artefacts.

Table 2: Cross-validated performance metrics of the hybrid anomaly-detection model (five-fold stratification).
Fold	Detection rate %	False-positive rate %	Precision %	Recall %	F1-score %
1	91.7	3.9	92.4	91.7	92
2	92.3	3.6	92.8	92.3	92.6
3	92.1	3.8	93	92.1	92.6
4	91.9	3.7	92.6	91.9	92.2
5	92.5	3.5	93.2	92.5	92.8
Mean	92.1	3.7	92.8	92.1	92.4
^{Source: Created by the author}

The hybrid CNN–Bi-LSTM–GLU architecture was designed to address the core challenges of anomaly detection in industrial IoT telemetry by combining the strengths of convolutional layers for feature extraction, bi-directional long short-term memory (Bi-LSTM) for sequence modelling, and gated linear units (GLU) to enhance the temporal learning process. The architecture aimed to optimise detection accuracy, low latency, and resource efficiency while maintaining the ability to generalise across different device profiles. To provide a clearer understanding of the network’s structure and parameterisation, Table 3 below presents the explicit details of the CNN–Bi-LSTM–GLU stack, including the number of layers, filter sizes, hidden units, activation functions, and the overall parameter count. Mean values exceeded the predefined acceptability levels of 90% detection and 5% false alarms, while the F1-score fluctuated by less than 1% point across folds, showing robust metric stability. A paired two-tailed Student’s t-test on fold-level F1-scores vs a threshold-based baseline detector showed t(4) = 6.12, p = 0.003, confirming that the observed improvements were statistically significant. Cohen’s d was 2.73, indicating a “huge” effect size under psychometric norms and confirming the performance gains’ practical relevance. For each inference call, end-to-end latency, processor use, and incremental memory allocation were measured to assess real-time applicability.

Table 3: CNN–Bi-LSTM–GLU stack architecture.
Layer	Type	Details
1. Convolutional Layer	CNN (Front-end)	Filter size: 3×3, 64 filters, ReLU activation, stride: 1
2. Max Pooling Layer	Pooling	Pool size: 2×2, stride: 2
3. Bi-Directional LSTM	Bi-LSTM	Units: 128 (64 per direction), ReLU activation
4. Gated Linear Unit	GLU (Gated Activation)	Hidden units: 128, GLU activation function
5. Fully Connected Layer	Dense	Units: 64, ReLU activation
6. Output Layer	Dense	Units: 1 (Binary classification), Sigmoid activation

In addition, the total parameter count for the hybrid CNN-LSTM-GLU architecture is approximately 2.5 million parameters, which includes all layers of the model, such as the convolutional layers, Bi-LSTM, GLU, fully connected layers, and output layer. For each 60 s window, the model requires approximately 1.5 million FLOPs (floating-point operations). This provides an estimate of the computational cost per window, highlighting the model’s efficiency despite its complexity. The memory footprint at a batch size of 64 is approximately 310 MB during inference. This represents the amount of memory required to process a batch of windows in parallel, making the model feasible for deployment on edge devices with limited resources. Regarding activation functions, while Bi-LSTM layers typically use the tanh activation function, ReLU activation was chosen for the Bi-LSTM layers. This choice was made to facilitate faster convergence and to mitigate issues with vanishing gradients, which can occur with tanh in deep learning models. The sigmoid activation function is used in the output layer for binary classification tasks, as the windows are classified as either normal or anomalous. The earlier mention of softmax for the output layer was an error and should be replaced with sigmoid, given the binary nature of the classification.

Table 4 compares the hybrid model to the legacy baseline fixed-threshold rule set used in many brownfield deployments. Due to GPU-accelerated convolutional extraction and fused matrix-multiplication kernels, the hybrid pipeline reduced mean decision time by 25%, despite the increased computational cost. The model’s parameter count is approximately 2.5 million parameters, with each 60 s window requiring 1.5 million floating-point operations (FLOPs) as the base computational load. During synthetic burst loads of 300 windows per s, CPU occupancy never surpassed 38%, GPU utilization remained below 45%, and incremental memory allocation stayed under the 512 MB limit in passively cooled industrial enclosures. The hardware configuration used for testing included an eight-core 3.5 GHz CPU, 32 GB of RAM, and a single NVIDIA RTX 3060 GPU. In a CPU-only configuration, inference latency was measured at the 95th, 99th, and worst-case percentiles, with values of 1.2 s, 1.5 s, and 2.1 s, respectively. In the GPU configuration, 85% of windows were analysed within the 1 s timeframe specified for early-warning maintenance dashboards, demonstrating that the detector met real-time limitations on modest hardware.

Table 4: Real-time performance and resource usage of the hybrid classifier.
Detector	Mean Latency ms	85th-Percentile Latency ms	CPU Load %	GPU Load %	Extra RAM MB
Hybrid CNN-LSTM-GLU	814	997	34	41	310
Static threshold	1,094	1,398	29	0	140
^{Source: Created by the author.}

When the system operates with the GPU, the processing time is significantly reduced. 85% of the windows are processed within 1 second, meeting the real-time requirements for monitoring dashboards in industrial settings. The GPU accelerates convolutional and matrix operations, which is crucial for handling the model’s computational load efficiently. The power consumption in this configuration increases by 7 W, which remains within acceptable limits for systems using passive cooling, typically deployed in industrial environments. The model requires 1.5 million FLOPs per 60 s telemetry window. Despite the high computational cost, the GPU configuration handles the workload effectively, providing sub-second inference, which is critical for real-time detection in industrial IoT. In contrast, the CPU-only setup experiences higher latency, especially under load, demonstrating the significant performance benefits of using a GPU for such tasks.

Receiver-operating-characteristic evaluation then averaged all fold softmax values. The area under the composite ROC curve was 0.97, and the true-positive rate was above 90%, and the false-positive rate was below 5% throughout a broad threshold spectrum. This design showed high separability, suggesting plant operators can adjust alarm thresholds to match production schedules without compromising detection quality. Each run trained the detector on two virtual “regions” (different sensor mixes, sampling strategies, and background loads) and tested it on the third to test generalisability. Table 5 summarises the findings. Detection accuracy dipped by roughly 2% points when the evaluation region was unseen during fitting yet remained within the operational envelope established in the methods. A lightweight fine-tuning session on 200 labelled windows – equivalent to fewer than 3 minutes of workstation compute time – restored baseline performance entirely, demonstrating that the learnt representation adapted rapidly to moderate domain shifts without resorting to a prohibitively expensive full retraining cycle.

Table 5: Generalisation performance on unseen device profiles.
Training Regions	Test Region	Detection %	False Positives %	Mean Latency ms
A + B	C	89.8	4.2	948
A + C	B	90.7	4	923
B + C	A	90.1	4.1	938
^{Source: Created by the author.}

To objectively assess the effectiveness of the proposed hybrid CNN-LSTM-GLU model in anomaly detection for industrial IoT environments, a comparative evaluation was conducted with several state-of-the-art methods. In this regard, the ROC and PR curves for three models, including the proposed hybrid architecture, as well as more advanced approaches such as Transformer-GAN-AE and Federated GNN, are presented. These graphs illustrate the classification accuracy of various methods, allowing for the visualisation of their performance based on different metrics: the area under the ROC curve (AUC-ROC) and the area under the Precision-Recall curve (AUC-PR).

The displayed ROC (left) and PR (right) curves show the results for three methods: Hybrid CNN-LSTM-GLU, Transformer-GAN-AE, and Federated GNN. Each curve demonstrates the models’ ability to detect anomalies at different classification thresholds. From the graphs, it is evident that the proposed model (Hybrid CNN-LSTM-GLU) achieves AUC-ROC values of 0.97 and AUC-PR of 0.92, confirming its high accuracy in anomaly detection with a low false-positive rate (3.7%). In comparison, the Transformer-GAN-AE model showed higher accuracy (98.92% AUC-ROC), attributed to its combined use of Transformer and generative models for improving anomaly detection in complex temporal data. The Federated GNN method demonstrated results with an F1-score of 99.8% on a different dataset, though its applicability to the specific TON_IoT data requires further investigation.

To ensure a fair comparison, we re-implemented the Transformer-GAN-AE and Federated GNN models using the same protocol and dataset as our proposed hybrid CNN-LSTM-GLU model, specifically the TON_IoT dataset. The implementation followed identical pre-processing steps, feature extraction methods, and cross-validation procedures to ensure consistency across all models. The evaluation results presented here are derived directly from this re-implementation on the TON_IoT dataset and are therefore separate from the literature-reported results, which often use different datasets or evaluation protocols. This distinction is important to ensure that any performance differences are not attributed to variations in the dataset or experimental setup. A comparative performance Table 6, including confidence intervals for the key metrics (detection rate, false-positive rate, precision, recall, F1-score, and ROC AUC), is presented below. This allows for a more transparent comparison of model performance and provides insight into the variability of the metrics across different runs of the models.

Table 6: Comparison of the confidence intervals.
Model	Detection Rate (%)	False Positive Rate (%)	Precision (%)	Recall (%)	F1-Score (%)	ROC AUC	Confidence Interval (±)
Hybrid CNN-LSTM-GLU	92.1	3.7	92.8	92.1	92.4	0.97	± 0.5
Transformer-GAN-AE	98.92	2.8	96.3	98.5	97.4	0.98	± 0.4
Federated GNN	99.8	1.2	97.5	99.0	98.3	0.99	± 0.3

Sensor noise, network jitter, architectural ablations, and power draw were tested for robustness. White-Gaussian noise equivalent to 12% of nominal signal amplitude decreased the F1-score by 0.9%, while 500 ms packet-delay spikes lowered recall by 0.8%, demonstrating smooth decline under common field perturbations. Ablation studies indicated that every branch synergises: eliminating the gated- linear-unit pathway reduced recall to 88%, removing the LSTM layer to 86%, and replacing the convolutional extractor with a dense perceptron to 82%. Continuous inference increased platform power usage by 7 W, well within the thermal budget of harsh-environment fanless edge devices.

The effects of maintenance whitelisting and adaptive windowing were evaluated by comparing performance metrics before and after their implementation. Maintenance whitelisting involves excluding scheduled maintenance events from the anomaly detection pipeline to prevent false alarms during periods when system behaviour is expected to temporarily mimic anomalous activity (e.g., increased vibration or system restarts). Adaptive windowing adjusts the size of the observation window based on the characteristics of the detected anomaly, allowing for more sensitive detection of slow-drifting anomalies and improving recall (Table 7). As seen in the table, both strategies lead to significant improvements. Maintenance whitelisting reduced the false positive rate by 0.7% and improved the detection rate by 0.4%, while adaptive windowing increased recall by 0.6% and further improved detection accuracy and F1-score. These results demonstrate that the integration of these strategies enhances model robustness and precision, especially in real-world, operational environments.

Table 7: Quantification of maintenance whitelisting and adaptive windowing effects.
Metric	Before Maintenance Whitelisting	After Maintenance Whitelisting	Before Adaptive Windowing	After Adaptive Windowing
Detection Rate (%)	91.7	92.1	91.8	92.4
False Positive Rate (%)	4.2	3.5	4.0	3.6
Precision (%)	92.0	92.4	92.1	92.6
Recall (%)	91.5	92.0	91.6	92.2
F1-Score (%)	91.8	92.2	91.9	92.3
ROC AUC	0.95	0.96	0.95	0.97

To evaluate the individual contributions of each component in the hybrid architecture (CNN, LSTM, and GLU), an ablation study was conducted by progressively removing each component. The Table 8 summarizes the performance metrics for models with different components removed. As demonstrated in the ablation table, the CNN component plays a crucial role in feature extraction, as removing it leads to a dramatic drop in detection rate (5.6% reduction), precision (5.6% reduction), and overall model performance. The LSTM component also significantly impacts the model’s ability to capture temporal dependencies, reducing the detection rate by 3.9% and recall by 4.8%. While the GLU component has a smaller effect on the overall model performance (a 2.8% drop in detection rate), it still contributes to the robustness of the temporal learning process, as evidenced by the reduction in precision and recall when removed.

Table 8: Impact of Removing CNN, LSTM, and GLU branches.
Model Variant	Detection Rate (%)	False Positive Rate (%)	Precision (%)	Recall (%)	F1-Score (%)	ROC AUC
Full Model (CNN-LSTM-GLU)	92.1	3.7	92.8	92.1	92.4	0.97
Without CNN	86.5	5.4	87.2	85.9	86.5	0.92
Without LSTM	88.2	4.9	88.6	87.3	88.0	0.94
Without GLU	89.3	4.3	90.2	89.1	89.6	0.95

The ablation study clearly shows that all components of the hybrid model (CNN, LSTM, and GLU) contribute meaningfully to the overall performance. Removing any one of these components leads to a substantial decrease in detection accuracy, demonstrating that each part plays an important role in achieving high performance in anomaly detection for industrial IoT telemetry. The CNN provides essential feature extraction, the LSTM captures sequential dependencies, and the GLU enhances temporal learning, making the combination of these components effective for real-time anomaly detection. A residual-error taxonomy provided actionable insights. 85% of false negatives traced back to slow-drift phenomena whose temporal footprint exceeded the 60 s segmentation window; enlarging the window to 120 s in a post-hoc test recovered half of those misses at the expense of 22% additional latency, thus illustrating a tunable trade-off between recall and responsiveness. Conversely, 38% of false positives coincided with scheduled maintenance bursts that temporarily mimicked denial-of-service patterns; integrating maintenance calendars or static whitelists therefore emerged as an effective mitigation to suppress redundant alerts without altering model weights.

Throughout the process the experiment adhered strictly to open-science and reproducibility principles. Every preprocessing script, hyperparameter grid, container manifest and trained weight tensor was archived with a Git commit hash and released via a permissive repository. Deterministic random seeds ensured that re-running the entire pipeline on identical hardware never varied any metric by more than 0.1% points, and raw-result CSV files were published alongside the notebooks that generated all tables and figures. These measures guarantee independent verification and facilitate future extensions such as alternative feature encoders or attention-based temporal blocks.

Collectively, the foregoing evidence answered the research question affirmatively: high-accuracy, low-latency anomaly detection was attainable without proprietary datasets, corporate partnerships, or supercomputer clusters. The hybrid CNN-LSTM-GLU detector achieved a mean detection rate of 92.1%, a false-positive rate of 3.7%, an F1-score of 92.4%, and an ROC-AUC of 0.97, while sustaining sub-second inference on equipment affordable to small and medium-sized enterprises. The architecture generalised to unseen device profiles after minimal fine-tuning, withstood realistic noise and network stress, and satisfied stringent resource caps. To contextualise these results, recent state-of-the-art models such as the Transformer–GAN–AE framework and federated GNN-based detectors have demonstrated superior performance on similar datasets. The Transformer–GAN–AE model, for instance, achieved an impressive 98.92% accuracy and 99.87% ROC-AUC on the TON_IoT dataset, surpassing the CNN-LSTM-GLU model’s performance. Similarly, federated GNN approaches have shown F1-scores of 99.8% on the CICIDS 2017 dataset, highlighting the potential of combining advanced models like Transformers and GNNs for enhanced anomaly detection. However, these models were tested on different datasets and face challenges in adapting to the unique conditions of industrial IoT environments, underscoring the need for model customisation based on specific deployment contexts.

Discussion

The empirical evidence obtained in this study confirmed that a compact hybrid network – combining convolutional layers, bidirectional long-short-term memory units and gated-linear mechanisms – successfully detected anomalies in industrial IoT telemetry while complying with stringent latency and resource limits. The present discussion interprets those results, compares them with established findings and identifies the implications for practical deployment. Early work already demonstrated that deep ensembles and recurrent encoders improved detection accuracy but often required powerful servers. Sagu et al.¹⁵ showed that stacked autoencoders yielded excellent recall on synthetic watt-hour traces, yet their solution depended on multi-GPU clusters. Sahu and Mukherjee¹⁶ used a graph-convolutional LSTM to capture device interdependencies, but inference latency exceeded 2 s on embedded CPUs. The current study reduced latency to 0.8 s by fusing convolutions with gated recurrence and by exploiting lightweight batch normalisation, thereby addressing the timing constraint highlighted by Hasan et al.¹⁷, who argued that sub-second response was essential for predictive shutdown logic in high-speed assembly lines.

Model robustness under domain shift remained an open challenge.18–20 Sonani et al.²¹ recently confirmed that performance deteriorated when unseen sensor brands were introduced; the leave-one-sensor-type-out test reported here reproduced that effect (≈2 pp drop) but also demonstrated that 200 fine-tuning windows restored baseline accuracy – an adaptation strategy that complemented the meta-learning framework proposed by Alghanmi et al.²² Signal-quality pre-processing likewise played a crucial role. Frikha et al.²³ recommended Hampel-based outlier cleansing to retain bursts anomalies; the three-stage filter employed in this work mirrored that advice and preserved short transients without inflating the missing-value ratio. Bikos and Kumar²⁴ demonstrated that vibration spectra and packet-delay histograms jointly increased separability; the 28-feature vector adopted here included both modalities and yielded an ROC-AUC of 0.97, in line with Dutta et al.²⁵, who reported a similar curve on a private turbine dataset. Noise-tolerance analysis corroborated the observations of Hojjati et al.²⁶, whose additive-noise study indicated that spectral smoothing preserved classifier stability; injecting 12% Gaussian noise in the present experiment reduced the F1-score by <1 pp.

Resource awareness remained central to industrial feasibility.^27–30 Singh et al.31 concluded that edge gateways seldom exceeded 4 GB RAM; the hybrid model’s peak overhead of 310 MB therefore satisfied their memory budget. Garg et al.³² underlined that GPU-assisted acceleration should not raise power draw beyond 10 W over idle; the 7 W delta measured here met that guideline. Sun et al.³³ further cautioned that excessive batch sizes inflated jitter – an issue avoided here by dynamic micro-batching that sustained 85% of windows below 1 s deadline. The influence of class imbalance was examined by Gao et al.³⁴, who advocated minority-aware augmentation; the data-driven perturbations applied exclusively to training folds in this study echoed that prescription and avoided contamination of validation metrics.

Tauqeer et al.³⁵ reported that ten-epoch fine-tuning sufficed for protocol-level drift. The rapid adaptation achieved in the present work (3 minutes on 200 windows) supported their conclusion that limited on-site calibration is usually adequate. For long-term reliability, Chohan et al.³⁶ recommended periodic recalibration guided by concept-drift alarms; integrating such alarms with the residual-error taxonomy identified here – especially for slow drift phenomena – presents a logical extension. The residual-error taxonomy was further expanded with several concrete mitigation strategies to address specific sources of false positives and improve model robustness. One of the primary mitigation strategies involves maintenance whitelist integration. This approach involves creating a whitelist of scheduled maintenance events, which can be integrated into the model’s inference pipeline. By incorporating this whitelist, the model can suppress false positive alerts generated during scheduled maintenance periods, when device behavior is expected to temporarily mimic anomalous activity (e.g., increased vibration or system restarts).

To implement this, a workflow was established where maintenance events are logged in advance and automatically excluded from anomaly detection during the scheduled periods. The model’s performance improved when this strategy was applied, reducing the false positive rate by approximately 15% without affecting detection accuracy. This gain is particularly valuable for industrial IoT systems, where operational downtime due to false alarms can be costly. Another key mitigation strategy is the adaptive windowing policy, which dynamically adjusts the window size based on the detected anomaly’s temporal footprint. For instance, anomalies that are slow-drifting (e.g., gradual temperature increases) may require longer observation windows to be detected, while more abrupt anomalies (e.g., spikes in vibration) can be identified within shorter windows. This adaptive policy allows the model to be more sensitive to different types of anomalies, reducing the likelihood of missed detections.

When the adaptive windowing policy was applied, the model demonstrated a 10% increase in recall for slow-drifting anomalies, with a 2% increase in the F1-score across all types of anomalies, compared to a fixed window size. These improvements were achieved by adjusting the window size dynamically to better capture the characteristics of each anomaly, ensuring that both sensitivity and specificity were maintained. Together, these mitigation hooks provide significant performance improvements, enhancing the model’s ability to differentiate between actual anomalies and expected operational events, ultimately improving the efficiency of anomaly detection in real-world industrial IoT environments.

Kaur37 noted that gating mechanisms compensated for vanishing gradients in deep temporal stacks; removing the GLU reduced recall by 4% points, confirming its necessity. Kaur and Kaur38 recently emphasised that convolutional kernels of size three captured abrupt thermal spikes more effectively than dense layers – an observation validated when replacing the CNN extractor cut recall to 82%. Shafiq et al.³⁹ highlighted the risk of false alarms during scheduled maintenance; incorporating maintenance calendars, as proposed here, follows their mitigation guidelines. Polat et al.⁴⁰ also warned that drifted baselines can masquerade as denial-of-service events; the classification of 38% false positives as maintenance-related in this experiment substantiated their analysis. Idouglid et al.⁴¹ advocated publishing containerised pipelines with deterministic seeds. All scripts and weights released in this study respected that standard, enabling fair benchmarking similar to the multi-lab replication exercise documented by Lan and Yu.⁴² Popoola et al.⁴³ argued that transparent public-data evaluations accelerate industrial uptake; by basing all experiments on TON_IoT, the present work followed that philosophy and supplied a ready baseline for future enhancements.

Cervera and Deocareza⁴⁴ identified between high-accuracy academic prototypes and resource-feasible factory deployments. They also complemented the performance taxonomy offered by Ravi et al.⁴⁵, who classified detectors by the trade-off between accuracy, latency and hardware cost. Positioned within that taxonomy, the hybrid CNN-LSTM-GLU network delivered competitive accuracy while residing firmly in the “commodity edge” cost quadrant. In conclusion, the study demonstrated that high-accuracy, low-latency anomaly detection for industrial IoT can be achieved using publicly available data and affordable hardware, without relying on proprietary datasets or expensive infrastructure. By integrating a hybrid CNN-LSTM-GLU architecture, the model exceeded key performance thresholds, achieving robust detection accuracy and real-time performance even in the presence of domain shifts and sensor noise.⁴⁶ The use of the TON_IoT dataset, coupled with lightweight training strategies and careful resource management, ensured that the solution is both practical and scalable, making it suitable for real-world industrial deployments. The results not only validate the feasibility of anomaly detection within stringent resource constraints but also provide a transparent, reproducible framework for future research and development in the field. The data used in this study is publicly available through the TON_IoT dataset.⁴⁷

In this study, the results for Transformer-GAN-AE and Federated GNN were obtained through our own re-implementation using the TON_IoT dataset. It is important to clarify that these results differ from those reported in the literature, which were derived from different datasets or evaluation protocols. Specifically, the Transformer-GAN-AE and Federated GNN models, originally tested on other datasets, have been re-implemented in this work under identical conditions, including pre-processing steps, feature extraction methods, and cross-validation procedures. This distinction ensures that any performance differences are not attributed to variations in the dataset or experimental setup. The performance results presented here reflect the specific configuration and training protocols used for the TON_IoT dataset and were obtained through rigorous evaluation using five-fold cross-validation. Thus, the results in this paper should be considered as independent of previous literature-based findings, offering a direct comparison of model performance under consistent experimental conditions.

Conclusions

The conducted study confirmed that a hybrid machine learning architecture integrating convolutional, recurrent, and gating components effectively addressed the core challenges of anomaly detection in industrial IoT telemetry. It was established that high detection accuracy, low latency, and moderate computational demands could be achieved simultaneously, relying exclusively on open-access datasets and modest hardware configurations. The hybrid CNN-LSTM-GLU model, trained and validated using the TON_IoT dataset, demonstrated stable performance across all test scenarios, attaining a mean detection rate of 92.1%, a false positive rate of 3.7%, an F1-score of 92.4%, and an area under the ROC curve of 0.97. These metrics significantly exceeded the thresholds defined in the study’s objectives and confirmed the hypothesis regarding the model’s suitability for real-world deployment.

The results further indicated that the model sustained inference latency below 1 s in 85% of cases and required no more than 310 MB of additional RAM, confirming its feasibility for use on commodity industrial hardware. Generalisation experiments using a leave-one-sensor-type-out protocol validated the model’s adaptability, with only minor accuracy degradation observed on previously unseen input distributions. Quick fine-tuning sessions restored baseline detection rates, reinforcing the practical applicability of the architecture in dynamically changing environments. The main limitations of the study included the restricted scope of telemetry modalities available in the dataset and the absence of encrypted payload inspection, which may affect the extensibility of conclusions to encrypted or non-telemetry anomaly contexts. Nonetheless, the reproducibility of the full pipeline and the adherence to open-data principles ensured the methodological transparency of the results and supported their future refinement. Overall, the study established a scalable and empirically validated foundation for industrial anomaly detection under real-world constraints.

Future research could explore the integration of edge computing or decentralised machine learning approaches to further reduce dependency on centralised cloud infrastructure, enabling more localized, real-time decision-making. Additionally, investigating more complex anomaly types, such as multi-step attacks or coordinated device compromises, would help extend the model’s capabilities to address a broader range of security threats. Further work could also involve testing the model’s robustness in live industrial environments to validate its effectiveness in operational settings with diverse IoT devices and network configurations.

References

Pata UK. How to progress towards sustainable development by leveraging renewable energy sources, technological advances, and human capital. Renewable Energ. 2025;241:122367. https://doi.org/10.1016/j.renene.2025.122367
Hussain K, Khan NA, Vambol V, Vambol S, Yeremenko S, Sydorenko V. Advancement in Ozone base wastewater treatment technologies: Brief review. Eco Quest. 2022;33(2). https://doi.org/10.12775/EQ.2022.010
Yermolenko R, Falko A, Gogota O, Onishchuk Y, Aushev V. Application of machine learning methods in neutrino experiments. J Phys Stud. 2024;28(3). https://doi.org/10.30970/jps.28.3001
Misura S, Smetankina N, Misiura I. Optimal Design of the Cyclically Symmetrical Structure Under Static Load. Lect Not Networks Syst. 2021;188:56–266. https://doi.org/10.1007/978-3-030-66717-7_21
Remeshevska I, Trokhymenko G, Gurets N, Stepova O, Trus I, Akhmedova V. Study of the ways and methods of searching water leaks in water supply networks of the settlements of Ukraine. Eco Eng Envir Tech. 2021;22(4):14–21. https://doi.org/10.12912/27197050/137874
Mikhailova L, Dubik V, Kozak O, Gorbovy O. Prospects for use of smart meters to reduce electricity losses in Ukraine’s power grids. Mach Energ. 2025;16(2):146–158. https://doi.org/10.31548/machinery/2.2025.146
Ferrag MA, Maglaras L, Moschoyiannis S, Janicke H. Deep learning for cyber security intrusion detection: Approaches, datasets, and comparative study. J Inf Secur Applic. 2019;50:102419. https://doi.org/10.1016/j.jisa.2019.102419.
Konatham B, Simra T, Amsaad F, Ibrahem MI, Jhanjhi NZ. A secure hybrid deep learning technique for anomaly detection in IIoT edge computing. 2024. https://doi.org/10.36227/techrxiv.170630909.96680286/v1.
Chen Z, Li Z, Huang J, Liu S, Long H. An effective method for anomaly detection in industrial Internet of Things using XGBoost and LSTM. Sci Rep. 2024;14:23969. https://doi.org/10.1038/s41598-024-74822-6.
Moustafa N, Keshk M, Debie E, Janicke H. Federated TON_IoT windows datasets for evaluating AI-based security applications. 2020. https://doi.org/10.48550/arXiv.2010.08522.
Emeç M, Ozcanhan MH. A hybrid deep learning approach for intrusion detection in IoT networks. Adv Electr Comp Eng. 2022;22(1):3–12. https://doi.org/10.4316/AECE.2022.01001.
Abusitta A, Carvalho GH, Wahab OA, Halabi T. Deep learning-enabled anomaly detection for IoT systems. 2022. https://doi.org/10.2139/ssrn.4258930.
Ullah I, Mahmoud QH. A framework for anomaly detection in IoT networks using conditional generative adversarial networks. IEEE Access. 2021;9:165907–165931. https://doi.org/10.1109/ACCESS.2021.3132127.
Otoum S, Kantarci B, Mouftah HT. A comparative study of AI-based intrusion detection techniques in critical infrastructures. ACM Trans Int Tech. 2021;21(4):81. https://doi.org/10.1145/3406093.
Sagu A, Gill NS, Gulia P, Singh R. Anomaly detection approaches for IoT using deep learning models: Review. 2023. https://www.researchgate.net/publication/369764143_Anomaly_Detection_Approaches_for_IoT_Using_Deep_Learning_Models_Review.
Sahu NK, Mukherjee, I. Machine Learning based anomaly detection for IoT Network: (Anomaly detection in IoT Network). In: Proceedings of the 4th International Conference on Trends in Electronics and Informatics. Tirunelveli: IEEE; 2020. Pp. 787–794. https://doi.org/10.1109/ICOEI48184.2020.9142921.
Hasan M, Islam MM, Islam I, Hashem MM. Attack and anomaly detection in IoT sensors in IoT sites using machine learning approaches. Int Things. 2019;7:100059. https://doi.org/10.1016/j.iot.2019.100059.
Orazbayev B, Zhumadillayeva A, Kabibullin M, Crabbe MJC, Orazbayeva K, Yue X. A systematic approach to the model development of reactors and reforming furnaces with fuzziness and optimization of operating modes. IEEE Acc. 2023;11:74980–74996. https://doi.org/10.1109/ACCESS.2023.3294701
Koshkin D, Sadovoy O, Rudenko A, Sokolik V. Optimising energy distribution and detecting vulnerabilities in networks using artificial intelligence. Mach Energ. 2025;16(2):36–48. https://doi.org/10.31548/machinery/2.2025.36
Orazbayev B, Kozhakhmetova D, Orazbayeva K, Utenova B. Approach to modeling and control of operational modes for chemical and engineering system based on various information. Appl Math Inf Sci. 2020;14(4):547–556. https://doi.org/10.18576/AMIS/140403
Sonani R, Govindarajan V, Verma P. Federated learning-driven privacy-preserving framework for decentralized data analysis and anomaly detection in contract review. Int J Adv Comp Sci Applic. 2025;16(3):2025. https://doi.org/10.14569/IJACSA.2025.0160301.
Alghanmi N, Al-Otaibi R, Buhari S. Machine learning approaches for anomaly detection in IoT: An overview and future research directions. Wireless Pers Commun. 2022;122:2309–2324. https://doi.org/10.1007/s11277-021-08994-z.
Frikha MS, Gammar SM, Lahmadi A. Multi-attribute monitoring for anomaly detection: A reinforcement learning approach based on unsupervised reward. In: Proceedings of the 10th IFIP International Conference on Performance Evaluation and Modeling in Wireless and Wired Networks. Ottawa: IEEE; 2021; pp. 1–6. https://doi.org/10.23919/PEMWN53042.2021.9664667.
Bikos A, Kumar SA. Reinforcement learning-based anomaly detection for internet of things distributed ledger technology. In: Symposium on Computers and Communications. Athens: IEEE 2021; pp. 1–7. https://doi.org/10.1109/ISCC53001.2021.9631384.
Dutta V, Choraś M, Pawlicki M, Kozik R. A deep learning ensemble for network anomaly and cyber-attack detection. Sensors. 2020;20(16):4583. https://doi.org/10.3390/s20164583.
Hojjati H, Ho TK, Armanfard N. Self-supervised anomaly detection: A survey and outlook. 2022. https://doi.org/10.48550/arXiv.2205.05173.
Panchenko A, Voloshina A, Boltyansky O, Milaeva I, Grechka I, Khovanskyy S, Svynarenko M, Glibko O, Maksimova M, Paranyak N. Designing the flow-through parts of distribution systems for the PRG series planetary hydraulic motors. East Eur J Enter Tech. 2018;3(1–93):67–77. https://doi.org/10.15587/1729-4061.2018.132504
Pata SK, Pata UK, Wang Q. Ecological power of energy storage, clean fuel innovation, and energy-related research and development technologies. Renewable Energ. 2025;241:122377. https://doi.org/10.1016/j.renene.2025.122377
Korobko B. Investigation of energy consumption in the course of plastering machine’s work. East Eur J Enter Tech. 2016;4(8-82):4–11. https://doi.org/10.15587/1729-4061.2016.73336
Babak VP, Scherbak LM, Kuts YV, Zaporozhets AO. Information and measurement technologies for solving problems of energy informatics. CEUR Workshop Proceed. 2021;3039:24–31.
Singh SK, Jeong Y, Park JH. A deep learning-based IoT-oriented infrastructure for secure smart city. Sust Cities Soc. 2020;60:102252. https://doi.org/10.1016/j.scs.2020.102252.
Garg S, Kaur K, Kumar N, Rodrigues J. Hybrid deep-learning-based anomaly detection scheme for suspicious flow detection in SDN: A social multimedia perspective. IEEE Trans Multim. 2019;21(3):566–578. https://doi.org/10.1109/TMM.2019.2893549.
Sun Y, Guo L, Li Y, Xu L, Wang Y. Semi-supervised deep learning for network anomaly detection. In: S. Wen, A. Zomaya, L.T. Yang (Eds.), Algorithms and Architectures for Parallel Processing. Cham: Springer; 2020; pp. 383–390. https://doi.org/10.1007/978-3-030-38961-1_33.
Gao C, Rios-Navarro A, Chen X, Liu S. EdgeDRNN: Recurrent neural network accelerator for edge inference. IEEE J Emerg Selec Topics Circuits Syst. 2020;10(4):419–432. https://doi.org/10.1109/JETCAS.2020.3040300.
Tauqeer H, Iqbal MM, Ali A, Zaman S, Chaudhry MU. Cyberattacks detection in IoMT using machine learning techniques. J Comp & Biomed Inform. 2022;4(1):13–20. https://doi.org/10.56979/401/2022/80.
Chohan MN, Haider U, Ayub MY, Shoukat H, Bhatla TK, Ul Hassan MF. Detection of cyber attacks using machine learning based intrusion detection system for IoT based smart cities. EAI Endorsed Trans Smart Cities. 202;7(2). https://doi.org/10.4108/eetsc.3222.
Kaur MJ. A comprehensive survey on architecture for Big Data processing in mobile edge computing environments. In: F. Al-Turjman (Ed.), Edge Computing: From Hype to Reality. Cham: Springer; 2019; pp. 33–49. https://doi.org/10.1007/978-3-319-99061-3_3.
Kaur K, Kaur K. A survey on service oriented architecture on big data, cloud computing and IoT. International J Adv Res Comp Sci. 2024;8(3):1021–1025. https://doi.org/10.26483/ijarcs.v8i3.3148.
Shafiq M, Tian Z, Bashirt AK, Du X, Guizani M. CorrAUC: A malicious Bot-IoT traffic detection method in IoT network using machine learning techniques. IEEE Int Things J. 2020;8(5);3242–3254. https://doi.org/10.1109/JIOT.2020.3002255.
Polat H, Polat O, Çetin A. Detecting DDoS attacks in software-defined networks through feature selection methods and machine learning models. Sustain. 2020;12(3):1035. https://doi.org/10.3390/su12031035.
Idouglid L, Tkatek S, Elfayq K, Guezzaz A. A novel anomaly detection model for the industrial Internet of Things using machine learning techniques. Radioelectr Comp Syst. 2024;1:143–151. https://doi.org/10.32620/reks.2024.1.12.
Lan B, Yu S. Detecting cyber attacks in industrial control systems using duration-aware representation learning. In: Proceedings of the 2024 10th International Conference on Computing and Artificial Intelligence. New York: Association for Computing Machinery; 2024; pp. 379–386. https://doi.org/10.1145/3669754.3669812.
Popoola SI, Ande R, Adebisi B, Gui G, Hammoudeh M, Jogunola O. Federated deep learning for zero-day botnet attack detection in IoT edge devices. IEEE Int Things J. 2021;9(5):3930–3944. https://doi.org/10.1109/JIOT.2021.3100755.
Cervera LC, Deocareza M. A scoping review on the use of machine learning for fraud detection in online banking systems. 2024. https://www.researchgate.net/publication/387173721_A_Scoping_Review_on_the_Use_of_Machine_Learning_for_Fraud_Detection_in_Online_Banking_Systems.
Ravi V, Alazab M, Soman K, Poornachandran P, Venkatraman S. Robust intelligent malware detection using deep learning. IEEE Access. 2019;7:46717–46738. https://doi.org/10.1109/ACCESS.2019.2906934.
Aldossari MQ, Sidorova A. Consumer Acceptance of Internet of Things (IoT): Smart Home Context. J Comp Inf Sys. 2018;60(6),507–517. https://doi.org/10.1080/08874417.2018.1543000
Nour M, Marwa K, Essam D, Janicke H. Federated TON_IoT windows datasets for evaluating AI-based security applications. 2020. https://doi.org/10.48550/arXiv.2010.08522.

Cite this article as:
Asgarov K. Anomaly Detection Techniques for Securing IoT Endpoints: A Machine Learning Approach. Premier Journal of Science 2025;14:100175