arxiv: 2604.03790 · v1 · submitted 2026-04-04 · 💻 cs.CR

Systematic Integration of Digital Twins and Constrained LLMs for Interpretable Cyber-Physical Anomaly Detection

Konstantinos E. Kampourakis , Vasileios Gkioulos , Sokratis Katsikas

Pith reviewed 2026-05-13 17:15 UTC · model grok-4.3

keywords: digital twins, large language models, industrial control systems, anomaly detection, cyber-physical security, SWaT testbed, constrained reasoning, attack localization

The pith

A digital twin paired with constrained large language models detects and localizes cyber attacks on industrial control systems by using heuristics first and invoking the model only when needed.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a hybrid detection system for industrial control systems that maintains a synchronized digital twin of the physical process to generate behavioral descriptors. Deterministic heuristics flag known attack signatures such as spoofing or valve forcing, while the LLM supplies semantic reasoning only on ambiguous cases, with its outputs forced into a JSON schema and checked by semantic plausibility filters. A temporal smoothing layer then produces the final decision. Evaluation on four standard SWaT attack scenarios shows the detector localizes each attack interval precisely, keeps time-to-detection low, and records zero false positives in the tested benign periods, with the same outcome whether a local LLaMA or cloud GPT model is used. A sympathetic reader would care because the approach supplies both real-time alerts and human-interpretable explanations for why a given interval was flagged, addressing a gap in purely statistical anomaly detectors.
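The control flow in that paragraph can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `heuristic_verdict`, `detect`, `ask_llm`, and `is_plausible`, the feature keys, and the thresholds are all hypothetical, and treating a rejected LLM reply as "no alert" is one possible policy that the source does not confirm.

```python
from typing import Callable, Dict, Optional, Tuple

Features = Dict[str, float]


def heuristic_verdict(features: Features) -> Optional[bool]:
    """Return True/False when a rule fires, or None to abstain.

    Both thresholds are invented for illustration.
    """
    if features["freeze_ratio"] > 0.9:
        return True  # sensor looks frozen: treat as a known signature
    if abs(features["slope"]) < 0.01 and features["freeze_ratio"] < 0.1:
        return False  # quiet, live signal: treat as benign
    return None  # ambiguous window: abstain


def detect(
    features: Features,
    ask_llm: Callable[[Features], dict],
    is_plausible: Callable[[dict], bool],
) -> Tuple[bool, str]:
    """Run heuristics first and call the LLM only when they abstain."""
    verdict = heuristic_verdict(features)
    if verdict is not None:
        return verdict, "heuristic"  # the LLM is never called on this path
    reply = ask_llm(features)
    if not is_plausible(reply):
        return False, "rejected"  # assumed policy: discard implausible replies
    return bool(reply["attack"]), "llm"
```

The second element of the returned tuple records which stage produced the verdict, which is the kind of bookkeeping that would make it possible to report how often the LLM was actually invoked.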

Core claim

The central claim is that a DT-driven hybrid detector, which combines deterministic heuristics with constrained LLM reasoning, achieves precise localization of each attack interval, low time-to-detect, and zero false positives in the evaluated benign region across four canonical SWaT scenarios; the same performance holds for both a local LLaMA model and a cloud-based GPT model.

What carries the argument

The constrained LLM reasoning step, invoked only when heuristics abstain and enforced by a JSON schema plus semantic plausibility filters that keep outputs physically consistent with the digital twin's representation.
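To make the idea of a schema plus plausibility filters concrete, here is a small sketch using only the Python standard library. The schema shown (three keys: `attack`, `attack_type`, `component`), the two plausibility rules, and the function name `parse_constrained` are assumptions made for illustration; the paper's actual schema and rules are not reproduced on this page. The attack-type labels mirror the four signatures named in the abstract.

```python
import json
from typing import Optional, Set

# Labels follow the four attack families named in the abstract, plus "none".
ALLOWED_TYPES = {"spoofing", "valve_forcing", "dos", "bias_drift", "none"}
REQUIRED_KEYS = {"attack", "attack_type", "component"}


def parse_constrained(raw: str, known_tags: Set[str]) -> Optional[dict]:
    """Return the parsed verdict, or None if the reply must be rejected."""
    try:
        reply = json.loads(raw)
    except json.JSONDecodeError:
        return None  # not JSON at all

    # Structural checks (hypothetical schema).
    if not isinstance(reply, dict) or set(reply) != REQUIRED_KEYS:
        return None
    if not isinstance(reply["attack"], bool):
        return None
    if not isinstance(reply["attack_type"], str) or not isinstance(reply["component"], str):
        return None
    if reply["attack_type"] not in ALLOWED_TYPES:
        return None

    # Plausibility rule 1 (assumed): the flag and the type must agree.
    if reply["attack"] != (reply["attack_type"] != "none"):
        return None
    # Plausibility rule 2 (assumed): a flagged component must exist in the twin.
    if reply["attack"] and reply["component"] not in known_tags:
        return None
    return reply
```

A reply that names a component absent from the twin's tag set is rejected rather than repaired, which keeps the filter simple to reason about; whether the authors reject or correct such replies is not stated here.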

Load-bearing premise

The digital twin must keep an accurate, synchronized, feature-enriched model of the physical process while the JSON schema and semantic filters must reliably force the LLM to produce only physically consistent outputs.

What would settle it

An experiment in which the detector either misses the exact start or end of an attack interval in one of the four SWaT scenarios or generates at least one false positive in the benign test region would show the central claim is not correct.
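The falsification test described above amounts to comparing a per-step alert sequence with one ground-truth attack interval. The sketch below shows one way to compute the relevant quantities; the function name `check_scenario`, the half-open interval convention, and the step-based time unit are assumptions, not details from the paper.

```python
from typing import Dict, Optional, Sequence


def check_scenario(alerts: Sequence[int], start: int, end: int) -> Dict[str, Optional[int]]:
    """Compare alerts with one attack interval covering steps start..end-1.

    alerts[t] is truthy when the detector fires at step t.
    """
    inside = [t for t in range(start, end) if alerts[t]]
    false_positives = sum(
        1 for t, fired in enumerate(alerts) if fired and not (start <= t < end)
    )
    return {
        # Steps between attack onset and the first alert inside the interval.
        "time_to_detect": inside[0] - start if inside else None,
        # Steps between the last alert inside the interval and the attack's final step.
        "end_gap": (end - 1) - inside[-1] if inside else None,
        # Alerts raised anywhere outside the attack interval.
        "false_positives": false_positives,
    }
```

Under this reading, the central claim would be contradicted by any scenario where `false_positives` is nonzero, where `time_to_detect` or `end_gap` is large, or where either is `None` because the attack was missed entirely.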

Figures

Figures reproduced from arXiv: 2604.03790 by Konstantinos E. Kampourakis, Sokratis Katsikas, Vasileios Gkioulos.

**Figure 1.** Overall DT architecture used in this work. The system begins with a replay server that streams a synchronized 30-second window of SWaT telemetry at each time step, providing a virtualized DT state aligned with the physical process. This window is enriched with behavioral descriptors, such as slopes, variances, freeze ratios, and actuator toggles, that form the DT’s interpretable feature-le…
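The four descriptors named in the caption can be computed from a single window with standard-library Python. The definitions below are plausible readings rather than the paper's own: slope is taken as a least-squares fit against the sample index, freeze ratio as the fraction of consecutive samples that do not change, and toggles as the number of actuator state changes. All function names and the `eps` tolerance are hypothetical.

```python
from statistics import pvariance
from typing import Dict, Sequence


def slope(values: Sequence[float]) -> float:
    """Least-squares slope of the values against their sample index."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def freeze_ratio(values: Sequence[float], eps: float = 1e-9) -> float:
    """Fraction of consecutive sample pairs whose difference is at most eps."""
    same = sum(1 for a, b in zip(values, values[1:]) if abs(b - a) <= eps)
    return same / (len(values) - 1)


def toggles(states: Sequence[int]) -> int:
    """Number of times a discrete actuator state changes within the window."""
    return sum(1 for a, b in zip(states, states[1:]) if a != b)


def describe(sensor: Sequence[float], actuator: Sequence[int]) -> Dict[str, float]:
    """Build the descriptor dictionary for one window (at least two samples)."""
    return {
        "slope": slope(sensor),
        "variance": pvariance(sensor),
        "freeze_ratio": freeze_ratio(sensor),
        "toggles": toggles(actuator),
    }
```

A window whose sensor reading never changes yields a freeze ratio of 1.0 and zero variance, which is the kind of signature a spoofing heuristic could key on.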

**Figure 2.** Example of the DT replay windowing over SWaT-like telemetry. … as attack and 395,336 rows (≈ 87.9%) as benign, reflecting a realistically imbalanced ICS operating regime. All experiments are conducted using the publicly available dataset, which contains normal operation interspersed with ground-truth attack intervals. The DT operates on a 30-second sliding window and processes the dataset sequentially from…

Original abstract

Cyber attacks targeting Industrial Control Systems (ICS) have become increasingly sophisticated and hard to identify. Detecting such attacks requires integrating low-level behavioral cues with high-level semantic interpretation, a capability that traditional anomaly detectors lack. This paper presents a Digital Twin (DT)-driven hybrid detection approach that combines deterministic heuristics with systematic, constrained Large Language Model (LLM) reasoning to achieve real-time incident detection. The DT maintains a synchronized, feature-enriched representation of the Secure Water Treatment (SWaT) process, deriving behavioral descriptors. Heuristics identify characteristic signatures of spoofing, valve forcing, denial-of-service, and bias drift, while the LLM is invoked only when heuristics abstain. A constrained JSON schema and semantic plausibility filters ensure physically consistent LLM outputs, and a temporal smoothing layer stabilizes the final decision signal. Evaluation on four canonical SWaT attack scenarios shows that the proposed detector precisely localizes each attack interval with low time-to-detect and zero False Positives (FPs) in the evaluated benign region. Results are consistent across both a local LLaMA model and a cloud-based GPT model, demonstrating the robustness of the constrained hybrid architecture. The findings highlight the potential of DT-guided LLM reasoning as a reliable and interpretable approach to ICS anomaly detection.
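The abstract does not say how the temporal smoothing layer works. One common choice is a k-of-n vote over the most recent raw decisions, sketched below as an assumption rather than as the authors' method; the function name `smooth` and the parameters `window` and `min_votes` are hypothetical.

```python
from collections import deque
from typing import Iterable, List


def smooth(raw: Iterable[int], window: int = 5, min_votes: int = 3) -> List[bool]:
    """Raise an alert only when at least min_votes of the last `window` raw flags are set."""
    recent = deque(maxlen=window)  # oldest flag drops out automatically
    out = []
    for flag in raw:
        recent.append(bool(flag))
        out.append(sum(recent) >= min_votes)
    return out
```

With `window=3` and `min_votes=2`, an isolated one-step spike is suppressed, while a sustained run of alerts is reported one step after it begins and cleared one step after it ends. That delay is the price of stability and feeds directly into any time-to-detect figure.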

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper's hybrid DT-plus-constrained-LLM setup for SWaT attack detection is a reasonable engineering step but the zero-FP claim rests on unshown DT fidelity and filter behavior.

The main thing here is a practical hybrid detector for ICS that runs deterministic heuristics first on digital-twin features and only hands off to an LLM when the heuristics are unsure. The LLM is forced to output JSON and then checked against simple physical plausibility rules before the final decision is smoothed. That gating pattern is the clearest new piece; prior work has used DTs or LLMs separately, but the explicit abstention-plus-filter flow tied to a live twin is not standard in the ICS security literature I know. The authors also show the same outcomes with both a local LLaMA and a cloud GPT, which is useful for reproducibility questions.

Evaluation is limited to four canonical SWaT attack traces, where they report exact localization, low detection delay, and zero false positives in the tested benign window. That is a clean result on paper. The soft spot is that none of the supporting numbers for the twin itself are given: no state-estimation error on the sensor traces, no count of how many LLM calls were rejected or corrected by the filters, and no test of normal process variation outside the four scenarios. If the twin drifts or the semantic filters miss an edge case, the zero-FP number will not hold. The abstract is silent on these checks, so the central performance claim cannot be verified from what is shown.

This is the kind of work that would interest people building real-time ICS monitors who already have a twin in place. It is not yet ready for a strong citation in my own papers because the validation gaps are too large. A serious editor should send it to review only if the full manuscript supplies the missing DT accuracy metrics, filter rejection rates, and at least one additional benign trace set; otherwise it stays in the preliminary category.

Referee Report

3 major / 0 minor

Summary. This paper proposes a hybrid anomaly detection system for Industrial Control Systems (ICS) that integrates a Digital Twin (DT) maintaining a synchronized, feature-enriched representation of the Secure Water Treatment (SWaT) process with deterministic heuristics for attack signatures (spoofing, valve forcing, DoS, bias drift) and a constrained LLM invoked only on heuristic abstention. A JSON schema and semantic plausibility filters enforce physically consistent LLM outputs, followed by temporal smoothing. The central empirical claim is that evaluation on four canonical SWaT attack scenarios yields precise attack-interval localization, low time-to-detect, and zero false positives in the evaluated benign region, with consistent results across local LLaMA and cloud GPT models.

Significance. If the performance claims hold under rigorous validation, the work would advance interpretable cyber-physical anomaly detection by showing how constrained LLMs can reliably augment heuristics in ICS settings without introducing hallucinations. The hybrid DT-LLM architecture and cross-model robustness offer a promising direction for explainable real-time detection, addressing limitations of purely statistical or rule-based methods.

major comments (3)

[Abstract] Abstract: The zero-FP claim in the evaluated benign region and precise localization with low TTD are load-bearing for the central performance assertion, yet the abstract supplies no quantitative validation of DT synchronization accuracy (e.g., state estimation error on SWaT sensor/actuator traces) or filter coverage (e.g., LLM rejection/correction statistics). Without these, it is unclear whether the result generalizes beyond the narrow evaluated window.
[Evaluation] Evaluation section: The reported strong empirical outcomes lack exact numerical metrics (e.g., TTD values, precision/recall), implementation details, or validation procedures for the four SWaT scenarios, preventing verification of whether the data support the stated claims of zero FPs and precise localization.
[Methodology] Methodology (constrained LLM subsection): The description of the JSON schema and semantic plausibility filters is insufficient to assess their reliability in forcing physically consistent outputs; no coverage metrics or failure cases are provided, which directly impacts the robustness claim across LLaMA and GPT.
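As an illustration of the first comment, a per-tag state estimation error could be reported along the following lines. This is a sketch only: the function name `mean_abs_error`, the choice of mean absolute error as the metric, and the dictionary layout are assumptions, and the paper may define twin fidelity differently (for instance as replay synchronization lag).

```python
from typing import Dict, Sequence


def mean_abs_error(
    twin: Dict[str, Sequence[float]],
    plant: Dict[str, Sequence[float]],
) -> Dict[str, float]:
    """Per-tag mean absolute error between twin state and plant measurements."""
    errors = {}
    for tag, measured in plant.items():
        predicted = twin[tag]
        if len(predicted) != len(measured):
            raise ValueError(f"length mismatch for tag {tag!r}")
        errors[tag] = sum(abs(p - m) for p, m in zip(predicted, measured)) / len(measured)
    return errors
```

Reporting the result per tag rather than as a single average would show whether any one sensor or actuator trace drives the error.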

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed review. The comments correctly identify areas where additional quantitative detail and expanded descriptions would improve clarity and verifiability. We address each major comment below and will revise the manuscript accordingly.

Referee: [Abstract] Abstract: The zero-FP claim in the evaluated benign region and precise localization with low TTD are load-bearing for the central performance assertion, yet the abstract supplies no quantitative validation of DT synchronization accuracy (e.g., state estimation error on SWaT sensor/actuator traces) or filter coverage (e.g., LLM rejection/correction statistics). Without these, it is unclear whether the result generalizes beyond the narrow evaluated window.

Authors: We agree that the abstract would benefit from explicit quantitative anchors. In the revised manuscript we will add concise statements of DT synchronization accuracy (mean state estimation error across sensor/actuator traces) and LLM filter coverage (rejection and correction rates) to support the generalization claim while preserving the abstract's length constraints. revision: yes

Referee: [Evaluation] Evaluation section: The reported strong empirical outcomes lack exact numerical metrics (e.g., TTD values, precision/recall), implementation details, or validation procedures for the four SWaT scenarios, preventing verification of whether the data support the stated claims of zero FPs and precise localization.

Authors: The current evaluation section emphasizes qualitative outcomes and consistency across models. To enable direct verification we will expand it with exact numerical results (TTD in seconds, precision/recall per scenario), a table of per-scenario metrics, and a description of the validation procedure used on the four canonical SWaT traces. Implementation details and a pointer to the evaluation scripts will also be included. revision: yes

Referee: [Methodology] Methodology (constrained LLM subsection): The description of the JSON schema and semantic plausibility filters is insufficient to assess their reliability in forcing physically consistent outputs; no coverage metrics or failure cases are provided, which directly impacts the robustness claim across LLaMA and GPT.

Authors: We acknowledge that the current subsection provides only a high-level outline. The revision will supply the full JSON schema definition, enumerate the semantic plausibility rules, report coverage statistics (fraction of LLM invocations that triggered each filter), and include a brief discussion of any observed failure modes or edge cases encountered during the LLaMA and GPT experiments. revision: yes
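The coverage statistic promised in this response, the fraction of LLM invocations that triggered each filter, is straightforward to tabulate. The sketch below assumes each invocation is logged as a list of the filter names it tripped; the log format, the filter names in the usage example, and the function name `filter_coverage` are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence


def filter_coverage(invocations: Sequence[List[str]]) -> Dict[str, object]:
    """Summarize how often each filter fired across logged LLM invocations.

    Each element of `invocations` lists the filters one LLM reply triggered;
    an empty list means the reply passed every filter.
    """
    total = len(invocations)
    if total == 0:
        return {"rejection_rate": 0.0, "per_filter": {}}
    # Count each filter at most once per invocation.
    counts = Counter(name for fired in invocations for name in set(fired))
    rejected = sum(1 for fired in invocations if fired)
    return {
        "rejection_rate": rejected / total,
        "per_filter": {name: n / total for name, n in counts.items()},
    }
```

For example, a log of `[[], ["unknown_component"], ["unknown_component", "type_mismatch"], []]` gives a rejection rate of 0.5, with `unknown_component` firing on half of the invocations and `type_mismatch` on a quarter.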

Circularity Check

0 steps flagged

No significant circularity; claims rest on external SWaT benchmark evaluation

The paper's derivation chain presents a DT-LLM hybrid detector whose core claims (precise attack localization, low TTD, zero FPs in evaluated benign region) are justified by direct evaluation on four canonical SWaT scenarios rather than by any quantity defined in terms of the detector itself. No self-definitional steps appear (e.g., no parameter fitted to a subset then renamed as prediction), no load-bearing self-citations reduce the central result to prior author work, and no ansatz or uniqueness theorem is smuggled in. The abstract and described architecture treat DT synchronization and JSON filters as design choices whose reliability is asserted via the external benchmark results, not by construction. This matches the default expectation for non-circular papers; the skeptic's concerns address verification gaps, not circular reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on two domain assumptions: that the digital twin faithfully reproduces the SWaT process dynamics and that the imposed JSON schema plus semantic filters guarantee physically valid LLM outputs. No free parameters or invented entities are mentioned in the abstract.

axioms (2)

domain assumption The digital twin maintains a synchronized, feature-enriched representation of the SWaT process.
Invoked to derive behavioral descriptors used by both heuristics and the LLM.
domain assumption Constrained JSON schema and semantic plausibility filters ensure physically consistent LLM outputs.
Required for the hybrid decision layer to remain reliable.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

HySecTwin: A Knowledge-Driven Digital Twin Framework Augmented with Hybrid Reasoning for Cyber-Physical Systems
cs.CR · 2026-05 · unverdicted · novelty 5.0

HySecTwin integrates semantic modeling with hybrid deterministic and fuzzy reasoning in digital twins to achieve faster, more interpretable threat detection in CPS with sub-millisecond latency.
