pith. machine review for the scientific record.

arxiv: 2604.21623 · v1 · submitted 2026-04-23 · 💻 cs.CR · cs.LG

Recognition: unknown

A-THENA: Early Intrusion Detection for IoT with Time-Aware Hybrid Encoding and Network-Specific Augmentation

Dimitra I. Kaklamani, Iakovos S. Venieris, Ioannis Panopoulos, Maria Lamprini A. Bartsioka, Sokratis Nikolaidis, Stylianos I. Venieris


Pith reviewed 2026-05-09 21:17 UTC · model grok-4.3

keywords IoT intrusion detection, early detection, time-aware encoding, transformer model, data augmentation, network security, edge computing, cybersecurity

The pith

A transformer with time-aware encoding detects IoT intrusions early and with near-zero errors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces A-THENA, an early intrusion detection system for IoT that builds on a transformer architecture. It adds Time-Aware Hybrid Encoding to fold packet timestamps directly into the model's processing so that timing patterns in traffic become visible to the detector. A separate Network-Specific Augmentation step generates training examples that reflect the quirks of particular IoT networks. The authors report that the combined changes produce higher accuracy than standard positional encodings, feature-only models, and other time-aware methods across three public datasets, while keeping false alarms and missed threats near zero. The system also runs with low latency and memory on a Raspberry Pi, showing it can operate at the edge without heavy servers.

Core claim

A-THENA augments a Transformer-based architecture with a generalized Time-Aware Hybrid Encoding (THE) that integrates packet timestamps to capture temporal dynamics essential for accurate early threat detection, together with a Network-Specific Augmentation (NA) pipeline that enhances model robustness and generalization. On the CICIoT23-WEB, MQTT-IoT-IDS2020, and IoTID20 datasets the approach delivers higher accuracy than traditional positional encodings, the strongest feature-based models, leading time-aware alternatives, and related methods, while producing near-zero false alarms and false negatives. Deployment tests on the Raspberry Pi Zero 2 W confirm that real-time detection is feasible.

What carries the argument

Time-Aware Hybrid Encoding (THE), which augments positional encodings with packet timestamp information inside a transformer, supported by a Network-Specific Augmentation (NA) pipeline that creates synthetic traffic examples matched to real IoT network traits.
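
To make the idea concrete, here is a minimal sketch of one way timestamps could replace packet indices in a sinusoidal positional encoding. The page does not give the exact THE formulation, so this is an assumption: it takes the standard sinusoidal scheme and evaluates it at the time elapsed since the first packet. The function name `time_aware_sinusoidal`, the `base` constant, and the use of seconds as the time unit are all hypothetical.

```python
import math


def time_aware_sinusoidal(timestamps, d_model, base=10000.0):
    """Sinusoidal encoding evaluated at elapsed time instead of packet index.

    Assumption: this mirrors the standard sinusoidal positional encoding with
    the integer position swapped for seconds since the first packet. It is not
    taken from the paper.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    t0 = timestamps[0]
    encodings = []
    for t in timestamps:
        dt = t - t0  # elapsed time since the first packet of the flow
        row = []
        for i in range(d_model // 2):
            freq = base ** (-2 * i / d_model)
            row.append(math.sin(dt * freq))
            row.append(math.cos(dt * freq))
        encodings.append(row)
    return encodings


# Two hypothetical flows with the same packet count but different timing.
burst = time_aware_sinusoidal([0.0, 0.001, 0.002, 0.003], d_model=8)
slow = time_aware_sinusoidal([0.0, 0.5, 1.0, 1.5], d_model=8)
```

An index-based encoding would assign `burst` and `slow` identical vectors because both have four packets. Here only the first packet matches, so the difference in pacing reaches the model.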

Load-bearing premise

The approach rests on the premise that packet timestamps supply timing information that standard encodings miss and that network-specific augmentation strengthens the model without introducing biases or artifacts that hurt performance on unseen real IoT traffic.

What would settle it

Running the model on a fresh IoT traffic dataset collected from a different environment and finding that accuracy falls to the level of baseline transformers without time encoding, or that false-positive rates rise above near zero, would undermine the central performance claims.

Figures

Figures reproduced from arXiv: 2604.21623 by Dimitra I. Kaklamani, Iakovos S. Venieris, Ioannis Panopoulos, Maria Lamprini A. Bartsioka, Sokratis Nikolaidis, Stylianos I. Venieris.

Figure 1: The A-THENA system architecture. (Top) Transformer encoder ingesting raw packet sequences (𝐹) with Time-Aware (TA) encodings from timestamps (𝑇): TA Sinusoidal and TA Fourier are added to the input projection, while TA RoPE modifies multi-head attention. The Early Detection Loss (EDL) enforces early classification. (Bottom) The Time-Aware Hybrid Encoding (THE) framework dynamically selects the optimal en…
Figure 2: A-THENA’s Training & Evaluation workflow. Modules are color-coded by functional role: data preparation (brown), model creation (purple), offline augmentation (green), THE variant selection (yellow), online network-specific augmentation (blue), and the EDL-based optimization step (red). Numbered circles denote the execution order within each stage, and arrows show data flow.
Figure 3: Comparison of standard index-based positional encodings (top row) against the proposed time-aware variants applied to SSH …
Figure 4: Prediction confidence trajectories as the number of observed packets (…
Figure 5: Sensitivity analysis of the confidence threshold (…
Figure 6: Attention weight heatmaps comparing A-THENA (TA RoPE) and RoPE on two representative flows from IoTID20. (Left pair) Host & Port Scan flow (𝑛 = 15). (Right pair) Mirai HTTP Flooding flow (𝑛 = 30).
… in IoTID20 lead to up to 33% of flows failing to meet the threshold within 30 packets. These cases revert to full-sequence predictions, increasing delay and slightly reducing accuracy. Overall, 𝜏 = 0.95 provides …
Figure 7: Impact of augmentation, EDL, and quantization on …
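
The text accompanying Figure 6 describes a confidence threshold 𝜏: a flow is classified once the model's confidence reaches 𝜏 within the first 30 packets, and flows that never reach it fall back to a full-sequence prediction. The sketch below shows that decision rule under stated assumptions. The function name `early_decision`, the per-packet confidence lists, and the returned tuple layout are hypothetical and not taken from the paper.

```python
def early_decision(confidences, labels, tau=0.95, max_packets=30):
    """Return (label, packets_used, met_threshold) for one flow.

    confidences[k] and labels[k] are the model's top-class confidence and
    predicted label after observing k + 1 packets (hypothetical inputs).
    """
    if not confidences or len(confidences) != len(labels):
        raise ValueError("need one confidence and one label per observed packet")
    for k in range(min(max_packets, len(confidences))):
        if confidences[k] >= tau:
            # Confident enough: stop early and report how many packets it took.
            return labels[k], k + 1, True
    # Threshold never met: fall back to the prediction on the full sequence.
    return labels[-1], len(labels), False


# Hypothetical confidence trajectories, for illustration only.
confident = early_decision([0.60, 0.80, 0.97, 0.99], ["benign", "scan", "scan", "scan"])
unsure = early_decision([0.50, 0.60, 0.70], ["benign", "benign", "scan"])
```

In this toy run the first flow is classified after three packets. The second never reaches 𝜏 and is decided on all of its packets, which is the case the text ties to increased delay.
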
original abstract

The proliferation of Internet of Things (IoT) devices has significantly expanded attack surfaces, making IoT ecosystems particularly susceptible to sophisticated cyber threats. To address this challenge, this work introduces A-THENA, a lightweight early intrusion detection system (EIDS) that significantly extends preliminary findings on time-aware encodings. A-THENA employs an advanced Transformer-based architecture augmented with a generalized Time-Aware Hybrid Encoding (THE), integrating packet timestamps to effectively capture temporal dynamics essential for accurate and early threat detection. The proposed system further employs a Network-Specific Augmentation (NA) pipeline, which enhances model robustness and generalization. We evaluate A-THENA on three benchmark IoT intrusion detection datasets (CICIoT23-WEB, MQTT-IoT-IDS2020, and IoTID20), where it consistently achieves strong performance. Averaged across all three datasets, it improves accuracy by 6.88 percentage points over the best-performing traditional positional encoding, 3.69 points over the strongest feature-based model, 6.17 points over the leading time-aware alternatives, and 5.11 points over related models, while achieving near-zero false alarms and false negatives. To assess real-world feasibility, we deploy A-THENA on the Raspberry Pi Zero 2 W, demonstrating its ability to perform real-time intrusion detection with minimal latency and memory usage. These results establish A-THENA as an agile, practical, and highly effective solution for securing IoT networks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces A-THENA, a Transformer-based early intrusion detection system (EIDS) for IoT networks. It proposes a Time-Aware Hybrid Encoding (THE) that incorporates packet timestamps to capture temporal dynamics, combined with a Network-Specific Augmentation (NA) pipeline for improved robustness. The system is evaluated on three public IoT IDS datasets (CICIoT23-WEB, MQTT-IoT-IDS2020, IoTID20), reporting average accuracy gains of 6.88 pp over traditional positional encodings, 3.69 pp over feature-based models, 6.17 pp over time-aware alternatives, and 5.11 pp over related models, with near-zero FPR/FNR. It also demonstrates real-time deployment on a Raspberry Pi Zero 2 W with low latency and memory footprint.

Significance. If the early-detection claims hold under prefix-based evaluation, the work would offer a practical, lightweight advance for IoT security by combining timestamp-aware encoding with targeted augmentation in a deployable Transformer model. The reported accuracy margins and hardware feasibility are notable for resource-constrained environments, though the absence of statistical tests and prefix-specific protocols limits immediate impact.
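
A prefix-based evaluation of the kind this paragraph refers to could look roughly like the sketch below: each flow is truncated to its first n packets before classification, and accuracy is reported per n. Everything here is an assumption made for illustration. `accuracy_by_prefix` and `toy_classify` are hypothetical names, the toy rule stands in for a trained model, and the flows are invented timestamp lists.

```python
def accuracy_by_prefix(flows, labels, classify, prefix_lengths=(5, 10, 20)):
    """Accuracy when the classifier only sees the first n packets of each flow."""
    results = {}
    for n in prefix_lengths:
        correct = 0
        for flow, label in zip(flows, labels):
            if classify(flow[:n]) == label:  # truncate before classifying
                correct += 1
        results[n] = correct / len(flows)
    return results


def toy_classify(timestamps):
    """Stand-in model: flag a flow if any inter-arrival gap is under 10 ms."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return "attack" if any(g < 0.01 for g in gaps) else "benign"


# Invented flows: the second one only starts bursting at its fourth packet.
flows = [
    [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    [0.0, 1.0, 2.0, 2.001, 2.002, 2.003],
]
labels = ["benign", "attack"]
results = accuracy_by_prefix(flows, labels, toy_classify, prefix_lengths=(3, 6))
```

With three packets the burst is not yet visible and the attack flow is missed. With six packets both flows are classified correctly. A curve of this shape separates early detection from full-flow classification.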

major comments (2)
  1. [Evaluation / Methodology] Evaluation section (and associated methodology): The manuscript does not describe whether input sequences to the Transformer are full flows/sessions or truncated prefixes of flows. Since the three benchmark datasets consist of complete labeled flows, the reported accuracy improvements and near-zero error rates may reflect improved overall classification rather than the claimed early detection on partial traffic; this directly affects the central EIDS positioning and the assumption that THE captures temporal dynamics for timely detection.
  2. [Results] Results and experimental setup: Averaged gains (e.g., 6.88 pp over positional encodings) are presented without error bars, statistical significance tests, or details on train/test splits, class imbalance handling, or how NA parameters were tuned. This makes it difficult to verify robustness of the gains or rule out overfitting to complete-flow statistics.
minor comments (2)
  1. [Abstract] Abstract and introduction: The phrase 'near-zero false alarms and false negatives' should be replaced with precise per-dataset FPR/FNR values for clarity.
  2. [Methodology] Notation: Define the exact formulation of THE (how timestamps are fused with positional encodings) and the NA pipeline steps earlier in the text to improve readability.
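
For reference, the per-dataset rates requested in the first minor comment follow directly from confusion counts. The sketch below assumes a binary benign/attack framing, which may not match the paper's label sets. The function name `error_rates` and the example predictions are hypothetical.

```python
def error_rates(y_true, y_pred, positive="attack"):
    """Return (FPR, FNR) for a binary labelling with one positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1  # missed threat
        else:
            if pred == positive:
                fp += 1  # false alarm
            else:
                tn += 1
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Invented predictions, for illustration only.
y_true = ["benign"] * 4 + ["attack"] * 4
y_pred = ["benign", "benign", "benign", "attack",
          "attack", "attack", "attack", "benign"]
fpr, fnr = error_rates(y_true, y_pred)
```

Reporting both numbers for each dataset would let readers judge what "near-zero" means in practice.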

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive feedback on our manuscript. We address each of the major comments in detail below, providing clarifications and outlining the revisions we will make to strengthen the paper.

point-by-point responses
  1. Referee: [Evaluation / Methodology] Evaluation section (and associated methodology): The manuscript does not describe whether input sequences to the Transformer are full flows/sessions or truncated prefixes of flows. Since the three benchmark datasets consist of complete labeled flows, the reported accuracy improvements and near-zero error rates may reflect improved overall classification rather than the claimed early detection on partial traffic; this directly affects the central EIDS positioning and the assumption that THE captures temporal dynamics for timely detection.

    Authors: We appreciate the referee's point on the need for explicit description of the input sequences. The current manuscript does not detail this aspect in the Evaluation section. To resolve this, we will revise the manuscript to clearly describe that our evaluation uses truncated prefixes of the flows to emulate early detection scenarios. We will specify the prefix lengths used (e.g., first 5, 10, 20 packets) and how labels are assigned based on the prefix content. This will directly support the EIDS claims and demonstrate that THE effectively captures temporal dynamics from partial traffic. We will also include results showing performance as a function of prefix length. revision: yes

  2. Referee: [Results] Results and experimental setup: Averaged gains (e.g., 6.88 pp over positional encodings) are presented without error bars, statistical significance tests, or details on train/test splits, class imbalance handling, or how NA parameters were tuned. This makes it difficult to verify robustness of the gains or rule out overfitting to complete-flow statistics.

    Authors: We agree that the results section would benefit from additional statistical rigor and experimental details. In the revised version, we will add error bars representing standard deviation over 5 independent runs with different random seeds. We will include statistical significance tests (e.g., Wilcoxon signed-rank test) for the reported accuracy improvements. Details on the train/test splits (stratified 70/30 split to handle imbalance) and class imbalance handling via the NA pipeline will be provided. Additionally, we will describe the tuning process for NA parameters, including the search space and selected values. These changes will enhance the verifiability and robustness of our findings. revision: yes
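
The second response promises error bars over 5 seeded runs and a paired significance test. The sketch below shows the general shape of such an analysis using only the Python standard library. It substitutes an exact paired sign-flip permutation test for the Wilcoxon signed-rank test named in the response, so it is a related technique and not the authors' procedure. The accuracy values and the function names `summarize` and `sign_flip_p_value` are invented for illustration.

```python
from itertools import product
from statistics import mean, stdev


def summarize(runs):
    """Mean and sample standard deviation across independent runs."""
    return mean(runs), stdev(runs)


def sign_flip_p_value(a, b):
    """Exact two-sided paired sign-flip permutation test.

    Note: this is a stand-in for the Wilcoxon signed-rank test, not that test.
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    hits = 0
    total = 0
    # Enumerate every assignment of signs to the paired differences.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical per-seed accuracies, NOT results from the paper.
model_acc = [0.96, 0.97, 0.95, 0.98, 0.96]
baseline_acc = [0.90, 0.91, 0.89, 0.92, 0.90]
m, s = summarize(model_acc)
p = sign_flip_p_value(model_acc, baseline_acc)
```

With five paired runs there are only 32 sign patterns, so the smallest two-sided p-value this exact test can return is 2/32 = 0.0625, which is what the toy data above yields. The number of runs therefore limits how strong any paired test result can be.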

Circularity Check

0 steps flagged

No significant circularity; empirical evaluation on external benchmarks

full rationale

The paper proposes A-THENA as an EIDS using Transformer architecture with Time-Aware Hybrid Encoding (THE) and Network-Specific Augmentation (NA). It reports averaged accuracy gains on three public IoT IDS datasets (CICIoT23-WEB, MQTT-IoT-IDS2020, IoTID20) against external baselines including positional encodings, feature-based models, and time-aware alternatives. No mathematical derivation chain, equations, or 'predictions' are described that reduce by construction to fitted parameters or self-definitions. The reference to extending 'preliminary findings' is a minor self-citation that does not bear the load of the central empirical claims, which remain independently testable on the cited public datasets. No uniqueness theorems, ansatzes, or renamings of known results are invoked in a circular manner.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 2 invented entities

The central claim rests on the effectiveness of newly proposed THE and NA techniques, which are inventions without independent verification beyond the paper's experiments. Standard ML training assumptions and dataset representativeness apply.

free parameters (2)
  • Transformer model hyperparameters
    Learning rate, number of layers, attention heads, and other training parameters chosen or optimized on the datasets.
  • Augmentation parameters in NA pipeline
    Parameters controlling how network-specific variations are generated, likely tuned for the target IoT networks.
axioms (2)
  • domain assumption The three benchmark datasets (CICIoT23-WEB, MQTT-IoT-IDS2020, IoTID20) are representative of real-world IoT intrusion scenarios.
    Relied upon for the validity of performance claims and generalization.
  • domain assumption Packet timestamps contain predictive temporal dynamics for distinguishing intrusions from normal traffic.
    Core justification for introducing time-aware hybrid encoding.
invented entities (2)
  • Time-Aware Hybrid Encoding (THE) no independent evidence
    purpose: Integrate packet timestamps into the Transformer to capture temporal dynamics for early threat detection.
    New encoding method proposed as part of A-THENA.
  • Network-Specific Augmentation (NA) no independent evidence
    purpose: Generate augmented data to improve robustness and generalization across different IoT network conditions.
    New augmentation pipeline introduced to enhance the model.

pith-pipeline@v0.9.0 · 5604 in / 1769 out tokens · 48680 ms · 2026-05-09T21:17:52.545041+00:00 · methodology


Reference graph

Works this paper leans on

80 extracted references · 69 canonical work pages

  1. [1]

    Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. TensorFlow: A System for Large-Scale Machine Learning. In12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16). 265–283

  2. [2]

    Iram Abrar, Zahrah Ayub, Faheem Masoodi, and Alwi M Bamhdi. 2020. A Machine Learning Approach for Intrusion Detection System on NSL-KDD Dataset. In2020 International Conference on Smart Electronics and Communication (ICOSEC). IEEE, 919–924. doi:10.1109/icosec49089.2020.9215232

  3. [3]

    Tanwir Ahmad and Dragos Truscan. 2024. Early Detection with Explainability of Network Attacks Using Deep Learning. In2024 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW). IEEE, 161–167. doi:10.1109/icstw60967.2024.00040

  4. [4]

    Tanwir Ahmad, Dragos Truscan, and Jüri Vain. 2023. Preliminary Results in Using Attention for Increasing Attack Identification Efficiency. In2023 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW). IEEE, 159–164. doi:10.1109/icstw58534.2023.00038

  5. [5]

    Tanwir Ahmad, Dragos Truscan, Juri Vain, and Ivan Porres. 2022. Early Detection of Network Attacks Using Deep Learning. In2022 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW). IEEE, 30–39. doi:10.1109/icstw55395.2022.00020

  6. [6]

    2013.Specification of the IP Flow Information Export (IPFIX) Protocol for the Exchange of Flow Information

    Paul Aitken. 2013.Specification of the IP Flow Information Export (IPFIX) Protocol for the Exchange of Flow Information. Technical Report. RFC Editor. doi:10.17487/rfc7011

  7. [7]

    Saleh Alabdulwahab, Young-Tak Kim, and Yunsik Son. 2024. Privacy-Preserving Synthetic Data Generation Method for IoT-Sensor Network IDS Using CTGAN.Sensors24, 22 (Nov. 2024), 7389. doi:10.3390/s24227389

  8. [8]

    Muhannad Almohaimeed and Faisal Albalwy. 2024. Enhancing IoT Network Security Using Feature Selection for Intrusion Detection Systems. Applied Sciences14, 24 (Dec. 2024), 11966. doi:10.3390/app142411966

  9. [9]

    Khashan, and Nour M

    Ayoob Almotairi, Samer Atawneh, Osama A. Khashan, and Nour M. Khafajah. 2024. Enhancing intrusion detection in IoT networks using machine learning-based feature selection and ensemble models.Systems Science & Control Engineering12, 1 (March 2024), 2321381. doi:10.1080/21642583. 2024.2321381

  10. [10]

    Philippe Biondi. 2024. Scapy: Packet Manipulation Tool. https://scapy.net. Accessed: April 24, 2026

  11. [11]

    Nguyen Kim Hai Bui, Nguyen Duy Chien, Péter Kovács, and Gergő Bognár. 2025. Transformer Encoder and Multi-Features Time2Vec for Financial Prediction. In2025 33rd European Signal Processing Conference (EUSIPCO). IEEE, 1682–1686. doi:10.23919/eusipco63237.2025.11226721

  12. [12]

    Christian Callegari, Stefano Giordano, and Michele Pagano. 2024. A Real Time Deep Learning Based Approach for Detecting Network Attacks.Big Data Research36 (May 2024), 100446. doi:10.1016/j.bdr.2024.100446

  13. [13]

    Taki Eddine Toufik Djaidja, Bouziane Brik, Sidi Mohammed Senouci, Abdelwahab Boualouache, and Yacine Ghamri-Doudane. 2024. Early Network Intrusion Detection Enabled by Attention Mechanisms and RNNs.IEEE Transactions on Information Forensics and Security19 (2024), 7783–7793. doi:10.1109/tifs.2024.3441862 Manuscript submitted to ACM 36 I. Panopoulos et al

  14. [14]

    Armando Domi, Christos Zonios, Giorgos Tatsis, Anastasios Drosou, and Dimitrios Tzovaras. 2025. NetPacketformer: Real-Time, Context-Aware Network Intrusion Detection with Transformers. In2025 IEEE International Conference on Cyber Security and Resilience (CSR). IEEE, 687–692. doi:10.1109/csr64739.2025.11130136

  15. [15]

    Yifan Fan, Hao Ma, Yiying Zhang, Siwei LI, Xiaoyan Guo, and Ben Wang. 2025. A DDoS attack detection method based on improved transformer and temporal feature enhancement.The Journal of Supercomputing81, 8 (June 2025). doi:10.1007/s11227-025-07440-2

  16. [16]

    Tofael Ahmed

    Kaniz Farhana, Maqsudur Rahman, and Md. Tofael Ahmed. 2020. An intrusion detection system for packet and flow based networks using deep neural network approach.International Journal of Electrical and Computer Engineering (IJECE)10, 5 (Oct. 2020), 5514. doi:10.11591/ijece.v10i5.pp5514-5525

  17. [17]

    Novoa, Manuel F

    Diego Fernandez, Laura Vigoya, Fidel Cacheda, Francisco J. Novoa, Manuel F. Lopez-Vizcaino, and Victor Carneiro. 2018. A Practical Application of a Dataset Analysis in an Intrusion Detection System. In2018 IEEE 17th International Symposium on Network Computing and Applications (NCA). IEEE, 1–5. doi:10.1109/nca.2018.8548316

  18. [18]

    Cordeiro, Merouane Debbah, Thierry Lestable, and Narinderjit Singh Thandi

    Mohamed Amine Ferrag, Mthandazo Ndhlovu, Norbert Tihanyi, Lucas C. Cordeiro, Merouane Debbah, Thierry Lestable, and Narinderjit Singh Thandi. 2024. Revolutionizing Cyber Threat Detection With Large Language Models: A Privacy-Preserving BERT-Based Lightweight Model for IoT/IIoT Devices.IEEE Access12 (2024), 23733–23750. doi:10.1109/access.2024.3363469

  19. [19]

    Xueying Han, Susu Cui, Song Liu, Chen Zhang, Bo Jiang, and Zhigang Lu. 2023. Network intrusion detection based on n-gram frequency and time-aware transformer.Computers & Security128 (May 2023), 103171. doi:10.1016/j.cose.2023.103171

  20. [20]

    Ramin Hasibi, Matin Shokri, and Mehdi Dehghan. 2019. Augmentation Scheme for Dealing with Imbalanced Network Traffic Classification Using Deep Learning. arXiv:1901.00204 [cs.NI] https://arxiv.org/abs/1901.00204

  21. [21]

    2021.Machine Learning Based IoT Intrusion Detection System: An MQTT Case Study (MQTT-IoT-IDS2020 Dataset)

    Hanan Hindy, Ethan Bayne, Miroslav Bures, Robert Atkinson, Christos Tachtatzis, and Xavier Bellekens. 2021.Machine Learning Based IoT Intrusion Detection System: An MQTT Case Study (MQTT-IoT-IDS2020 Dataset). Springer International Publishing, 73–84. doi:10.1007/978-3-030-64758-2_6

  22. [22]

    Dai, Matthew D

    Cheng-Zhi Anna Huang, Ashish Vaswani, Jakob Uszkoreit, Ian Simon, Curtis Hawthorne, Noam Shazeer, Andrew M. Dai, Matthew D. Hoffman, Monica Dinculescu, and Douglas Eck. 2019. Music Transformer: Generating Music with Long-Term Structure. In7th International Conference on Learning Representations, ICLR 2019

  23. [23]

    Lei Huang, Jie Qin, Yi Zhou, Fan Zhu, Li Liu, and Ling Shao. 2023. Normalization Techniques in Training DNNs: Methodology, Analysis and Application.IEEE Transactions on Pattern Analysis and Machine Intelligence45, 8 (Aug. 2023), 10173–10196. doi:10.1109/tpami.2023.3250241

  24. [24]

    Md Mahbub Islam, Tanwir Ahmad, and Dragos Truscan. 2023. An Evaluation of Transformer Models for Early Intrusion Detection in Cloud Continuum. In2023 IEEE International Conference on Cloud Computing Technology and Science (CloudCom). IEEE, 279–284. doi:10.1109/cloudcom59040.2023.00052

  25. [25]

    Jinquan Ji, Yu Cao, Yukun Ma, and Jianzhuo Yan. 2025. TITD: enhancing optimized temporal position encoding with time intervals and temporal decay in irregular time series forecasting.Applied Intelligence55, 6 (Feb. 2025). doi:10.1007/s10489-025-06293-9

  26. [26]

    Xi Jiang, Shinan Liu, Aaron Gember-Jacobson, Arjun Nitin Bhagoji, Paul Schmitt, Francesco Bronzino, and Nick Feamster. 2024. NetDiffusion: Network Data Augmentation Through Protocol-Constrained Traffic Generation.Proceedings of the ACM on Measurement and Analysis of Computing Systems8, 1 (Feb. 2024), 1–32. doi:10.1145/3639037

  27. [27]

    Seyed Mehran Kazemi, Rishab Goel, Sepehr Eghbali, Janahan Ramanan, Jaspreet Sahota, Sanjay Thakur, Stella Wu, Cathal Smyth, Pascal Poupart, and Marcus Brubaker. 2019. Time2Vec: Learning a Vector Representation of Time. arXiv:1907.05321 [cs.LG] https://arxiv.org/abs/1907.05321

  28. [28]

    Philip K

    Muhammad Almas Khan, Muazzam A. Khan, Sana Ullah Jan, Jawad Ahmad, Sajjad Shaukat Jamal, Awais Aziz Shah, Nikolaos Pitropakis, and William J. Buchanan. 2021. A Deep Learning-Based Intrusion Detection System for MQTT Enabled IoT.Sensors21, 21 (Oct. 2021), 7016. doi:10.3390/s21217016

  29. [29]

    Byunghyun Kim and Jae-Gil Lee. 2024. Continuous-Time Linear Positional Embedding for Irregular Time Series Forecasting. arXiv:2409.20092 [cs.LG] https://arxiv.org/abs/2409.20092

  30. [30]

    Taehoon Kim and Wooguil Pak. 2022. Early Detection of Network Intrusions Using a GAN-Based One-Class Classifier.IEEE Access10 (2022), 119357–119367. doi:10.1109/access.2022.3221400

  31. [31]

    Kingma and Jimmy Ba

    Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. InInternational Conference on Learning Representations (ICLR)

  32. [32]

    2022.Real-Time Systems: Design Principles for Distributed Embedded Applications

    Hermann Kopetz and Wilfried Steiner. 2022.Real-Time Systems: Design Principles for Distributed Embedded Applications. Springer Nature, Chapter Internet of Things, 325–341

  33. [33]

    Ghorbani

    Arash Habibi Lashkari, Gerard Draper-Gil, Mohammad Saiful Islam Mamun, and Ali A. Ghorbani. 2018. CICFlowMeter (formerly ISCXFlowMeter): a network traffic bi-flow generator and analyser for anomaly detection. GitHub repository, https://github.com/ISCX/CICFlowMeter. Version V4.0

  34. [34]

    Thi-Thu-Huong Le, Yustus Eko Oktian, and Howon Kim. 2022. XGBoost for Imbalanced Multiclass Classification-Based Industrial Internet of Things Intrusion Detection Systems.Sustainability14, 14 (July 2022), 8707. doi:10.3390/su14148707

  35. [35]

    Yang Li, Si Si, Gang Li, Cho-Jui Hsieh, and Samy Bengio. 2021. Learnable Fourier Features for Multi-dimensional Spatial Positional Encoding. In Advances in Neural Information Processing Systems, Vol. 34. 15816–15829

  36. [36]

    Chang Liu, Ruslan Antypenko, Iryna Sushko, and Oksana Zakharchenko. 2022. Intrusion Detection System After Data Augmentation Schemes Based on the VAE and CVAE.IEEE Transactions on Reliability71, 2 (June 2022), 1000–1010. doi:10.1109/tr.2022.3164877

  37. [37]

    Willian Tessaro Lunardi, Martin Andreoni Lopez, and Jean-Pierre Giacalone. 2023. ARCADE: Adversarially Regularized Convolutional Autoencoder for Network Anomaly Detection.IEEE Transactions on Network and Service Management20, 2 (June 2023), 1305–1318. doi:10.1109/tnsm.2022.3229706

  38. [38]

    Novoa, Diego Fernández, Víctor Carneiro, and Fidel Cacheda

    Manuel López-Vizcaíno, Francisco J. Novoa, Diego Fernández, Víctor Carneiro, and Fidel Cacheda. 2019. Early Intrusion Detection for OS Scan Attacks. In2019 IEEE 18th International Symposium on Network Computing and Applications (NCA). 1–5. doi:10.1109/nca.2019.8935067 Manuscript submitted to ACM A-THENA 37

  39. [39]

    Yu Ma, Zhining Liu, Chenyi Zhuang, Yize Tan, Yi Dong, Wenliang Zhong, and Jinjie Gu. 2022. Non-stationary Time-aware Kernelized Attention for Temporal Event Prediction. InProceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’22). ACM, 1224–1232. doi:10.1145/3534678.3539470

  40. [40]

    Aggarwal, Y

    Liam Daly Manocchio, Siamak Layeghy, Wai Weng Lo, Gayan K. Kulatilleke, Mohanad Sarhan, and Marius Portmann. 2024. FlowTransformer: A transformer framework for flow-based network intrusion detection systems.Expert Systems with Applications241 (May 2024), 122564. doi:10.1016/j. eswa.2023.122564

  41. [41]

    Melícias, Tiago F

    Francisco S. Melícias, Tiago F. R. Ribeiro, Carlos Rabadão, Leonel Santos, and Rogério Luís De C. Costa. 2024. GPT and Interpolation-Based Data Augmentation for Multiclass Intrusion Detection in IIoT.IEEE Access12 (2024), 17945–17965. doi:10.1109/access.2024.3360879

  42. [42]

    Safaa Menssouri and El Mehdi Amhoud. 2025. A Conditional Tabular GAN-Enhanced Intrusion Detection System for Rare Attacks in IoT Networks. In 2025 IEEE International Conference on Communications Workshops (ICC Workshops). IEEE, 1918–1923. doi:10.1109/iccworkshops67674.2025.11162182

  43. [43]

    Yisroel Mirsky, Tomer Doitshman, Yuval Elovici, and Asaf Shabtai. 2018. Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection. InProceedings 2018 Network and Distributed System Security Symposium (NDSS 2018). Internet Society. doi:10.14722/ndss.2018.23204

  44. [44]

    Kohei Miyamoto, Chansu Han, Tao Ban, Takeshi Takahashi, and Jun’Ichi Takeuchi. 2024. Intrusion Detection Simplified: A Feature-free Approach to Traffic Classification Using Transformers. In2024 Annual Computer Security Applications Conference Workshops (ACSAC Workshops). IEEE, 20–29. doi:10.1109/acsacw65225.2024.00011

  45. [45]

    Alsubaei, and Abdulaleem Ali Almazroi

    Rasheed Mohammad, Faisal Saeed, Abdulwahab Ali Almazroi, Faisal S. Alsubaei, and Abdulaleem Ali Almazroi. 2024. Enhancing Intrusion Detection Systems Using a Deep Learning and Data Augmentation Approach.Systems12, 3 (March 2024), 79. doi:10.3390/systems12030079

  46. [46]

    Leila Mohammadpour, Teck Chaw Ling, Chee Sun Liew, and Alihossein Aryanfar. 2022. A Survey of CNN-Based Network Intrusion Detection. Applied Sciences12, 16 (Aug. 2022), 8162. doi:10.3390/app12168162

  47. [47]

    Daniel Moreno-Cartagena, Guillermo Cabrera-Vives, Pavlos Protopapas, Cristobal Donoso-Oliva, Manuel Pérez-Carrasco, and Martina Cádiz-Leyton

  48. [48]

    Positional Encodings for Light Curve Transformers: Playing with Positions and Attention

  49. [49]

    Ghorbani

    Euclides Carlos Pinto Neto, Sajjad Dadkhah, Raphael Ferreira, Alireza Zohourian, Rongxing Lu, and Ali A. Ghorbani. 2023. CICIoT2023: A Real-Time Dataset and Benchmark for Large-Scale Attacks in IoT Environment.Sensors23, 13 (June 2023), 5941. doi:10.3390/s23135941

  50. [50]

    Aleksander Ogonowski, Michał Żebrowski, Arkadiusz Ćwiek, Tobiasz Jarosiewicz, Konrad Klimaszewski, Adam Padee, Piotr Wasiuk, and Michał Wójcik. 2025. Preliminary study on artificial intelligence methods for cybersecurity threat detection in computer networks based on raw data packets.Computer Science26, SI (July 2025). doi:10.7494/csci.2025.26.si.7079

  51. [51]

    Abiodun Esther Omolara, Abdullah Alabdulatif, Oludare Isaac Abiodun, Moatsum Alawida, Abdulatif Alabdulatif, Wafa’ Hamdan Alshoura, and Humaira Arshad. 2022. The internet of things security: A survey encompassing unexplored areas and new insights.Computers & Security112 (Jan. 2022), 102494. doi:10.1016/j.cose.2021.102494

  52. [52]

    Evaluating Time Series Models for Urban Wastewater Management: Predictive Performance, Model Complexity and Resilience , url =

    Ioannis Panopoulos, Maria-Lamprini A. Bartsioka, Sokratis Nikolaidis, Stylianos I. Venieris, Dimitra I. Kaklamani, and Iakovos S. Venieris. 2025. Dynamic Temporal Positional Encodings for Early Intrusion Detection in IoT. In2025 10th International Conference on Smart and Sustainable Technologies (SpliTech). IEEE, 1–6. doi:10.23919/splitech65624.2025.11091651

  53. [53]

    Narendra Patwardhan, Stefano Marrone, and Carlo Sansone. 2023. Transformers in the Real World: A Survey on NLP Applications.Information14, 4 (April 2023), 242. doi:10.3390/info14040242

  54. [54]

    María Rodríguez, Álvaro Alesanco, Lorena Mehavilla, and José García. 2022. Evaluation of Machine Learning Techniques for Traffic Flow-Based Intrusion Detection.Sensors22, 23 (Nov. 2022), 9326. doi:10.3390/s22239326

  55. [55]

    Hyewon Ryu, Sara Yu, and Ki Yong Lee. 2023. TI-former: A Time-Interval Prediction Transformer for Timestamped Sequences. In2023 IEEE/ACIS 21st International Conference on Software Engineering Research, Management and Applications (SERA). IEEE, 319–325. doi:10.1109/sera57763.2023.10197830

  56. [56]

    Saiyedand and Irfan Al-Anbagi

    Makhduma F. Saiyedand and Irfan Al-Anbagi. 2024. Deep Ensemble Learning With Pruning for DDoS Attack Detection in IoT Networks.IEEE Transactions on Machine Learning in Communications and Networking2 (2024), 596–616. doi:10.1109/tmlcn.2024.3395419

  57. [57]

    2021.NetFlow Datasets for Machine Learning-Based Network Intrusion Detection Systems

    Mohanad Sarhan, Siamak Layeghy, Nour Moustafa, and Marius Portmann. 2021.NetFlow Datasets for Machine Learning-Based Network Intrusion Detection Systems. Springer International Publishing, 117–135. doi:10.1007/978-3-030-72802-1_9

  58. [58]

    Muhammad Shafiq, Zhihong Tian, Ali Kashif Bashir, Xiaojiang Du, and Mohsen Guizani. 2021. CorrAUC: A Malicious Bot-IoT Traffic Detection Method in IoT Network Using Machine-Learning Techniques. IEEE Internet of Things Journal 8, 5 (March 2021), 3242–3254. doi:10.1109/jiot.2020.3002255

  59. [59]

    Mohaddeseh Shahhosseini, Hoda Mashayekhi, and Mohsen Rezvani. 2022. A Deep Learning Approach for Botnet Detection Using Raw Network Traffic Data. Journal of Network and Systems Management 30, 3 (April 2022). doi:10.1007/s10922-022-09655-7

  60. [60]

    Ankit Sharma, Thiruvengadam Samon, Akhash Vellandurai, and Vinoth Kumar. 2023. TA-SAITS: Time Aware-Self Attention based Imputation of Time Series algorithm for Partially Observable Multi-Variate Time Series. In 2023 International Conference on Machine Learning and Applications (ICMLA). IEEE, 2228–2233. doi:10.1109/icmla58977.2023.00336

  61. [61]

    Peter Shaw, Jakob Uszkoreit, and Ashish Vaswani. 2018. Self-Attention with Relative Position Representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers). Association for Computational Linguistics, 464–468. doi:10.18653/v1/n18-2074

  62. [62]

    Nirhoshan Sivaroopan, Dumindu Bandara, Chamara Madarasingha, Guillaume Jourjon, Anura P. Jayasumana, and Kanchana Thilakarathna. 2024. NetDiffus: Network traffic generation by diffusion models through time-series imaging. Computer Networks 251 (Sept. 2024), 110616. doi:10.1016/j.comnet.2024.110616

  64. [64]

    Ziyang Song, Qincheng Lu, He Zhu, David Buckeridge, and Yue Li. 2025. TrajGPT: Irregular Time-Series Representation Learning of Health Trajectory. IEEE Journal of Biomedical and Health Informatics (2025), 1–14. doi:10.1109/jbhi.2025.3620205

  65. [65]

    Jianlin Su, Murtadha Ahmed, Yu Lu, Shengfeng Pan, Wen Bo, and Yunfeng Liu. 2024. RoFormer: Enhanced transformer with Rotary Position Embedding. Neurocomputing 568 (Feb. 2024), 127063. doi:10.1016/j.neucom.2023.127063

  66. [66]

    Petros Toupas, Dimitra Chamou, Konstantinos M. Giannoutakis, Anastasios Drosou, and Dimitrios Tzovaras. 2019. An Intrusion Detection System for Multi-class Classification Based on Deep Neural Networks. In 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA). IEEE, 1253–1258. doi:10.1109/icmla.2019.00206

  67. [67]

    Talia Tseriotou, Adam Tsakalidis, and Maria Liakata. 2024. TempoFormer: A Transformer for Temporally-aware Representations in Change Detection. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 19635–19653. doi:10.18653/v1/2024.emnlp-main.1095

  68. [68]

    Imtiaz Ullah and Qusay H. Mahmoud. 2020. A Scheme for Generating a Dataset for Anomalous Activity Detection in IoT Networks. Springer International Publishing, 508–520. doi:10.1007/978-3-030-47358-7_52

  69. [69]

    Imtiaz Ullah and Qusay H. Mahmoud. 2022. Design and Development of RNN Anomaly Detection Model for IoT Networks. IEEE Access 10 (2022), 62722–62750. doi:10.1109/access.2022.3176317

  70. [70]

    Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In Advances in Neural Information Processing Systems, Vol. 30.

  71. [71]

    Wei Wang, Yiqiang Sheng, Jinlin Wang, Xuewen Zeng, Xiaozhou Ye, Yongzhong Huang, and Ming Zhu. 2018. HAST-IDS: Learning Hierarchical Spatial-Temporal Features Using Deep Neural Networks to Improve Intrusion Detection. IEEE Access 6 (2018), 1792–1806. doi:10.1109/access.2017.2780250

  72. [72]

    Yue Wang, Yiming Jiang, and Julong Lan. 2021. FCNN: An Efficient Intrusion Detection Method Based on Raw Network Traffic. Security and Communication Networks 2021 (June 2021), 1–13. doi:10.1155/2021/5533269

  73. [73]

    Tingsong Xiao, Zelin Xu, Wenchong He, Zhengkun Xiao, Yupu Zhang, Zibo Liu, Shigang Chen, My T. Thai, Jiang Bian, Parisa Rashidi, and Zhe Jiang. 2025. XTSFormer: Cross-Temporal-Scale Transformer for Irregular-Time Event Prediction in Clinical Applications. Proceedings of the AAAI Conference on Artificial Intelligence 39, 27 (April 2025), 28502–28510. doi:10....

  74. [74]

    Dongyu Zhang, Liang Wang, Xin Dai, Shubham Jain, Junpeng Wang, Yujie Fan, Chin-Chia Michael Yeh, Yan Zheng, Zhongfang Zhuang, and Wei Zhang. 2023. FATA-Trans: Field And Time-Aware Transformer for Sequential Tabular Data. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (CIKM ’23). ACM, 3247–3256. doi:10.1145/3...

  75. [75]

    Xueqin Zhang, Jiahao Chen, Yue Zhou, Liangxiu Han, and Jiajun Lin. 2019. A Multiple-Layer Representation Learning Model for Network-Based Attack Detection. IEEE Access 7 (2019), 91992–92008. doi:10.1109/access.2019.2927465

  76. [76]

    Yong Zhang, Xu Chen, Da Guo, Mei Song, Yinglei Teng, and Xiaojuan Wang. 2019. PCCN: Parallel Cross Convolutional Neural Network for Abnormal Network Traffic Flows Detection in Multi-Class Imbalanced Network Traffic Flows. IEEE Access 7 (2019), 119904–119916. doi:10.1109/access.2019.2933165

  77. [77]

    Yuanyun Zhang and Shi Li. 2025. ChronoFormer: Time-Aware Transformer Architectures for Structured Clinical Event Modeling. arXiv:2504.07373 [cs.LG] https://arxiv.org/abs/2504.07373

  78. [78]

    Ying Zhang and Qiang Liu. 2022. On IoT intrusion detection based on data augmentation for enhancing learning on unbalanced samples. Future Generation Computer Systems 133 (Aug. 2022), 213–227. doi:10.1016/j.future.2022.03.007

  79. [79]

    Jilei Zhou, Guanran Jiang, Wei Du, and Cong Han. 2022. Profiling temporal learning interests with time-aware transformers and knowledge graph for online course recommendation. Electronic Commerce Research 23, 4 (March 2022), 2357–2377. doi:10.1007/s10660-022-09541-z

  80. [80]

    Yujie Zhu, Dezhi Han, and Xinming Yin. 2021. A hierarchical network intrusion detection model based on unsupervised clustering. In Proceedings of the 13th International Conference on Management of Digital EcoSystems (MEDES ’21). ACM, 22–29. doi:10.1145/3444757.3485098

Received 5 August 2025; revised 2 April 2026; accepted 9 April 2026