Hybrid ResNet-1D-BiGRU with Multi-Head Attention for Cyberattack Detection in Industrial IoT Environments
Pith reviewed 2026-05-10 18:33 UTC · model grok-4.3
The pith
A hybrid model stacking ResNet-1D, bidirectional GRU, and multi-head attention detects IIoT cyberattacks with 98.71 percent accuracy and 0.0001-second latency.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The hybrid ResNet-1D-BiGRU with Multi-Head Attention model, after SMOTE balancing on Edge-IIoTset, reaches 98.71 percent accuracy, 0.0417 percent loss, and 0.0001 sec/instance inference latency; the identical architecture tested on CICIoV2024 yields 99.99 percent accuracy, 0.0028 loss, zero false-positive rate, and 0.00014 sec/instance latency, surpassing all compared existing methods on both collections.
What carries the argument
The stacked hybrid network of 1D residual blocks for spatial feature extraction, bidirectional GRU layers for temporal sequence modeling in both directions, and multi-head attention for dynamic weighting of salient feature channels before final classification.
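To make the last stage of that stack concrete, here is a minimal sketch of multi-head scaled dot-product attention in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the learned query, key, value, and output projections are omitted (treated as identity), and the names `softmax`, `attention`, `multi_head_attention`, and the toy input are introduced here for illustration only.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention for one head.

    Each argument is a list of T vectors (lists of floats).
    Returns (outputs, weights); each row of weights sums to 1 over time steps.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w_j * v[i] for w_j, v in zip(w, values))
                        for i in range(len(values[0]))])
    return outputs, weights

def multi_head_attention(sequence, num_heads):
    """Split the feature axis into heads, attend per head, concatenate.

    Assumption: learned projections are omitted, so Q = K = V = the head's slice.
    """
    d_model = len(sequence[0])
    if d_model % num_heads != 0:
        raise ValueError("feature size must be divisible by num_heads")
    d_head = d_model // num_heads
    merged = [[] for _ in sequence]
    for h in range(num_heads):
        part = [step[h * d_head:(h + 1) * d_head] for step in sequence]
        head_out, _ = attention(part, part, part)
        for t, vec in enumerate(head_out):
            merged[t].extend(vec)
    return merged

# Toy input: 3 time steps, 4 features (standing in for recurrent-layer outputs).
toy = [[1.0, 0.0, 0.5, 0.2], [0.0, 1.0, 0.1, 0.9], [0.5, 0.5, 0.3, 0.3]]
result = multi_head_attention(toy, num_heads=2)
```

In a trained network each head would first apply its own learned projections, which is what lets different heads weight different aspects of the sequence.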
If this is right
- Inference times below 0.0002 seconds per sample make continuous, on-device monitoring feasible inside resource-limited IIoT gateways without adding perceptible delay to control loops.
- Zero false-positive rate on one benchmark implies the model can flag attacks while generating almost no spurious alerts that would burden human operators.
- Consistent superiority over prior methods on two independent datasets indicates that the spatial-temporal-attention combination extracts more discriminative signatures than simpler convolutional, recurrent, or attention-only baselines.
- SMOTE balancing during training enables the network to learn rare attack classes without requiring additional real-world attack traces.
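A per-instance latency figure like the one in the first bullet is typically obtained by timing a batch and dividing by its size. The sketch below shows that arithmetic; `per_instance_latency`, `throughput`, and `dummy_predict` are hypothetical names introduced here, and the paper's actual measurement protocol (hardware, batch size, warm-up) is not described in this summary.

```python
import time

def per_instance_latency(predict, batch, repeats=5):
    """Return the best observed seconds-per-instance over several timed runs."""
    if not batch:
        raise ValueError("batch must not be empty")
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        predict(batch)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed / len(batch))
    return best

def throughput(latency_seconds):
    """Instances per second implied by a per-instance latency."""
    return 1.0 / latency_seconds

# Hypothetical stand-in for a trained classifier: labels a row by its feature sum.
def dummy_predict(batch):
    return [1 if sum(row) > 1.0 else 0 for row in batch]

batch = [[0.1 * i, 0.2, 0.3] for i in range(1000)]
latency = per_instance_latency(dummy_predict, batch)
print(f"measured: {latency:.2e} s/instance")

# Sustained rates implied by the two reported latencies (single stream).
rate_a = round(throughput(0.0001))
rate_b = round(throughput(0.00014))
print(rate_a, rate_b)
```

Taking the minimum over repeats reduces scheduling noise; a mean or a high percentile would be the more conservative choice for a control-loop budget.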
Where Pith is reading between the lines
- The same layered architecture could be retrained on sensor-stream data from other domains such as power-grid anomaly detection or vehicle network security.
- Performance on encrypted or zero-day traffic would need separate validation because the current benchmarks consist of labeled, unencrypted flows with known attack signatures.
- Embedding the detector inside existing IIoT protocol stacks could allow smaller operators to add advanced threat monitoring without building large labeled datasets from scratch.
Load-bearing premise
The two chosen public datasets contain traffic patterns and attack distributions that match those of real deployed industrial IoT systems, and SMOTE-generated samples do not introduce artifacts that artificially inflate accuracy on the held-out test portions.
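Whether synthetic samples can inflate held-out accuracy depends on the order of operations. The sketch below shows the leakage-free order (stratified split first, oversampling on the training part only) using a simplified SMOTE-style interpolation written in plain Python. The helper names, the toy data, and `k=3` are assumptions for illustration; this is not the paper's pipeline or a library implementation.

```python
import random

def stratified_split(labels, test_fraction, rng):
    """Split indices per class so both parts keep the class proportions."""
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = max(1, int(round(len(indices) * test_fraction)))
        test_idx.extend(indices[:cut])
        train_idx.extend(indices[cut:])
    return train_idx, test_idx

def smote_like(minority, n_new, k, rng):
    """Create n_new points by interpolating toward one of the k nearest minority neighbours."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

rng = random.Random(0)
# Toy data: 40 "normal" rows (label 0) and 10 "attack" rows (label 1).
rows = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)]
rows += [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(10)]
labels = [0] * 40 + [1] * 10

# Step 1: split first (80/20, stratified), so the test set holds only real rows.
train_idx, test_idx = stratified_split(labels, 0.2, rng)

# Step 2: oversample the minority class using training rows only.
train_minority = [rows[i] for i in train_idx if labels[i] == 1]
majority_count = sum(1 for i in train_idx if labels[i] == 0)
new_rows = smote_like(train_minority, majority_count - len(train_minority), k=3, rng=rng)

train_rows = [rows[i] for i in train_idx] + new_rows
train_labels = [labels[i] for i in train_idx] + [1] * len(new_rows)
test_rows = [rows[i] for i in test_idx]
```

Reversing the two steps would let interpolated points derived from test rows (or their near-duplicates) reach the training set, which is the leakage path the premise has to rule out.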
What would settle it
Applying the trained model to a fresh collection of live industrial IoT packet traces that contain attack variants absent from both Edge-IIoTset and CICIoV2024 and measuring whether accuracy falls below 95 percent or false-positive rate rises above 2 percent.
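The two thresholds in that test can be checked mechanically once predictions on the new traces exist. A minimal sketch follows, assuming label 0 means normal traffic and that any non-normal prediction on a normal sample counts as a false positive; the function names and the toy labels are hypothetical, and the paper's own multiclass FPR definition is not given here.

```python
def accuracy_and_fpr(y_true, y_pred, normal_label=0):
    """Return (accuracy, false-positive rate).

    FPR = normal samples flagged as any attack / all normal samples.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    normal_total = sum(t == normal_label for t in y_true)
    false_pos = sum(t == normal_label and p != normal_label
                    for t, p in zip(y_true, y_pred))
    fpr = false_pos / normal_total if normal_total else 0.0
    return correct / len(y_true), fpr

def claim_survives(accuracy, fpr, min_accuracy=0.95, max_fpr=0.02):
    """True when neither threshold from the settling test is breached."""
    return accuracy >= min_accuracy and fpr <= max_fpr

# Toy field sample: 90 normal flows, 10 attack flows.
y_true = [0] * 90 + [1] * 10
# Hypothetical predictions: 3 normal flows flagged, 1 attack missed.
y_pred = [1] * 3 + [0] * 87 + [0] * 1 + [1] * 9

acc, fpr = accuracy_and_fpr(y_true, y_pred)
verdict = claim_survives(acc, fpr)
```

In this toy sample accuracy stays above 95 percent while the false-positive rate does not stay under 2 percent, which shows why the two criteria need to be reported separately.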
Original abstract
This study introduces a hybrid deep learning model for intrusion detection in Industrial IoT (IIoT) systems, combining ResNet-1D, BiGRU, and Multi-Head Attention (MHA) for effective spatial-temporal feature extraction and attention-based feature weighting. To address class imbalance, SMOTE was applied during training on the Edge-IIoTset dataset. The model achieved 98.71% accuracy, a loss of 0.0417%, and low inference latency (0.0001 sec/instance), demonstrating strong real-time capability. To assess generalizability, the model was also tested on the CICIoV2024 dataset, where it reached 99.99% accuracy and F1-score, with a loss of 0.0028, 0% FPR, and 0.00014 sec/instance inference time. Across all metrics and datasets, the proposed model outperformed existing methods, confirming its robustness and effectiveness for real-time IoT intrusion detection.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a hybrid deep learning model integrating ResNet-1D, BiGRU, and Multi-Head Attention for cyberattack detection in Industrial IoT systems. SMOTE is used to address class imbalance on the Edge-IIoTset dataset, yielding 98.71% accuracy and 0.0001 sec/instance latency. Evaluation on CICIoV2024 shows 99.99% accuracy, 0% FPR, and similar low latency, with claims of outperforming prior methods.
Significance. If validated without data leakage and with proper generalization, the low-latency hybrid model could advance real-time intrusion detection in resource-constrained IIoT environments. The dual-dataset evaluation is a positive step, but the significance hinges on whether the results reflect genuine improvements rather than dataset artifacts or improper validation.
major comments (3)
- [Methods (SMOTE application)] The abstract states SMOTE was applied 'during training' on Edge-IIoTset, but no explicit statement confirms that the train/test split preceded SMOTE application. Without this, synthetic samples could have leaked into the test set, potentially inflating the 98.71% accuracy and 0% FPR. This is load-bearing for the performance claims.
- [Experimental Results] Details on hyperparameter tuning (e.g., learning rate, number of heads in MHA, hidden sizes) and whether test data was used in tuning are absent. The free parameters listed include these, raising risk of overfitting to the specific datasets.
- [Evaluation and Generalizability] No cross-dataset transfer experiments, adversarial robustness tests, or analysis of how well Edge-IIoTset and CICIoV2024 represent real industrial IoT traffic distributions are provided. This undermines the claim of 'robustness ... for real-time IoT intrusion detection'.
minor comments (2)
- [Abstract] The loss is reported as 0.0417%, which is an unusual unit for a loss; clarify whether this is a raw cross-entropy value or genuinely a percentage.
- [Notation] Use abbreviations such as 'FPR' consistently and expand each one on first use.
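The first minor comment turns on what unit the loss carries. Categorical cross-entropy is a mean negative log-probability, so it is a plain non-negative number rather than a percentage. The sketch below, using made-up probabilities and a hypothetical `cross_entropy` helper, shows how different the two readings of "0.0417%" are.

```python
import math

def cross_entropy(probabilities, true_classes):
    """Mean categorical cross-entropy in nats (unitless, not a percentage)."""
    total = 0.0
    for probs, label in zip(probabilities, true_classes):
        total += -math.log(probs[label])
    return total / len(true_classes)

# Made-up predicted class probabilities for three samples.
probs = [[0.97, 0.02, 0.01], [0.05, 0.94, 0.01], [0.02, 0.02, 0.96]]
labels = [0, 1, 2]
loss = cross_entropy(probs, labels)

# exp(-loss) is the geometric mean of the probability given to the true class.
as_raw_value = math.exp(-0.0417)        # reading 0.0417 as the loss itself
as_percentage = math.exp(-0.0417 / 100)  # reading the "%" sign literally

print(f"toy loss: {loss:.4f}")
print(f"implied true-class probability: {as_raw_value:.4f} vs {as_percentage:.5f}")
```

Read as a raw value, the figure implies a typical true-class probability of about 0.96; read literally as a percentage, it implies about 0.9996, a much stronger claim about model confidence.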
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive feedback on our manuscript. We have addressed each of the major comments by revising the paper to clarify the SMOTE application process, provide hyperparameter tuning details, and discuss generalizability limitations. Our responses are provided point-by-point below.
- Referee: [Methods (SMOTE application)] The abstract states SMOTE was applied 'during training' on Edge-IIoTset, but no explicit statement confirms that the train/test split preceded SMOTE application. Without this, synthetic samples could have leaked into the test set, potentially inflating the 98.71% accuracy and 0% FPR. This is load-bearing for the performance claims.
  Authors: We thank the referee for highlighting this critical methodological detail. The full Methods section (Section 3.2) describes that the Edge-IIoTset dataset was first partitioned into an 80/20 train/test split using stratified sampling to maintain class proportions, after which SMOTE was applied exclusively to the training set. No synthetic samples were generated or included in the test set. To remove any potential ambiguity in the abstract, we have revised it to read: 'The Edge-IIoTset dataset was split into training and test sets prior to applying SMOTE exclusively to the training data.' We have also added an explicit paragraph in the Methods section outlining the exact sequence of operations and confirming the absence of leakage.
  revision: yes
- Referee: [Experimental Results] Details on hyperparameter tuning (e.g., learning rate, number of heads in MHA, hidden sizes) and whether test data was used in tuning are absent. The free parameters listed include these, raising risk of overfitting to the specific datasets.
  Authors: We appreciate the referee's concern about transparency in hyperparameter selection and the associated risk of overfitting. In the revised manuscript, we have inserted a new subsection 'Hyperparameter Optimization and Model Selection' under Experimental Setup. This subsection specifies the search ranges (learning rate: 1e-4 to 1e-2; number of MHA heads: 2, 4, 8; BiGRU hidden sizes: 64, 128, 256), the optimization method (grid search with 5-fold cross-validation performed solely on the training partition), and the final selected values. We explicitly state that the test set was never used during tuning or model selection, and we provide the complete list of chosen hyperparameters for reproducibility.
  revision: yes
- Referee: [Evaluation and Generalizability] No cross-dataset transfer experiments, adversarial robustness tests, or analysis of how well Edge-IIoTset and CICIoV2024 represent real industrial IoT traffic distributions are provided. This undermines the claim of 'robustness ... for real-time IoT intrusion detection'.
  Authors: We acknowledge that cross-dataset transfer learning, adversarial robustness evaluations, and a quantitative comparison of dataset distributions against real-world IIoT traffic would provide stronger evidence of generalizability. Our current evaluation already demonstrates consistent high performance across two datasets with differing characteristics and attack profiles. In the revised manuscript we have added a 'Limitations and Future Directions' section that discusses these gaps, includes a qualitative comparison of traffic features to published IIoT benchmarks, and outlines planned follow-up experiments on transfer and adversarial settings. Because performing the additional experiments would require substantial new computation and data collection beyond the scope of this revision, we have addressed the comment through expanded discussion rather than new empirical results.
  revision: partial
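The tuning procedure described in the second response (grid search with 5-fold cross-validation confined to the training partition) can be sketched in plain Python as follows. The three learning-rate sample points are an assumption, since only the range 1e-4 to 1e-2 is stated; `placeholder_score` is a hypothetical stand-in for training and validating the model, and its peak is arbitrary rather than the authors' selected configuration.

```python
import itertools
import random

def k_fold_indices(n_samples, k, rng):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val

def grid_search(train_size, grid, score_fn, k=5, seed=0):
    """Return (best_params, best_mean_score) using training-partition indices only."""
    rng = random.Random(seed)
    splits = list(k_fold_indices(train_size, k, rng))
    best_score, best_params = float("-inf"), None
    for values in itertools.product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        mean_score = sum(score_fn(params, tr, va) for tr, va in splits) / k
        if mean_score > best_score:
            best_score, best_params = mean_score, params
    return best_params, best_score

grid = {
    "learning_rate": [1e-4, 1e-3, 1e-2],  # assumed sample points within the stated range
    "mha_heads": [2, 4, 8],
    "bigru_hidden": [64, 128, 256],
}

def placeholder_score(params, train_idx, val_idx):
    """Hypothetical stand-in for: fit on train_idx, return accuracy on val_idx."""
    return (-abs(params["learning_rate"] - 1e-3)
            - abs(params["mha_heads"] - 4) / 100
            - abs(params["bigru_hidden"] - 128) / 10000)

# train_size counts rows in the training partition; test rows never enter here.
best_params, best_score = grid_search(train_size=100, grid=grid,
                                      score_fn=placeholder_score)
```

Because the fold indices are drawn only from the training partition, the held-out test rows cannot influence which of the 27 configurations is selected.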
Circularity Check
No circularity: purely empirical model evaluation with no derivation chain
full rationale
The paper proposes a hybrid neural architecture (ResNet-1D + BiGRU + Multi-Head Attention) and reports empirical accuracies, F1, loss, and latency on two public datasets after SMOTE oversampling during training. No mathematical derivation, first-principles prediction, or equation chain is claimed or present; performance figures are direct experimental outputs, not quantities that reduce to fitted parameters or self-citations by construction. The central claims rest on standard train/test evaluation rather than any self-definitional or load-bearing self-referential step, satisfying the self-contained benchmark criterion.
Axiom & Free-Parameter Ledger
free parameters (2)
- SMOTE oversampling ratio
- Model hyperparameters (learning rate, number of heads, hidden sizes)
axioms (1)
- domain assumption: Network traffic features are sufficient to distinguish attacks from normal behavior.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel
  Tag: unclear (relation between the paper passage and the cited Recognition theorem).
  Passage: "hybrid deep learning model ... combining ResNet-1D, BiGRU, and Multi-Head Attention (MHA) for effective spatial-temporal feature extraction"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.