ClawGuard: Out-of-Band Detection of LLM Agent Workflow Hijacking via EM Side Channel
Pith reviewed 2026-05-08 09:15 UTC · model grok-4.3
The pith
ClawGuard detects LLM agent workflow hijacks by capturing the electromagnetic signals that hardware activity emits, using sensors located outside the potentially compromised host.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ClawGuard converts radio-frequency streams captured by external software-defined radios into 320-dimensional feature vectors through a drift-aware pipeline, and uses those vectors to map the distinct electromagnetic envelope of each LLM agent skill. On that basis it reports detecting workflow hijacking attempts with a 100% true-positive rate and a 1.16% false-positive rate, even when the host is fully compromised.
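The page does not say how the 320-dimensional vectors are built. As a minimal sketch of what a drift-aware envelope feature extractor could look like, the following assumes 64 frequency bands times 5 summary statistics (which happens to give 320 values), a frame length of 128 samples, and per-band median subtraction as the drift correction. All of these choices, and the names `band_log_power` and `extract_features`, are illustrative assumptions and not the paper's method.

```python
import cmath
import math
import statistics
from typing import List, Sequence

# All three constants are assumptions chosen so that 64 * 5 = 320.
FRAME_LEN = 128  # samples per analysis frame
N_BANDS = 64     # DFT bins kept per frame
N_STATS = 5      # summary statistics per band


def band_log_power(frame: Sequence[float]) -> List[float]:
    """Log10 power of the first N_BANDS DFT bins of one real-valued frame."""
    n = len(frame)
    powers = []
    for k in range(N_BANDS):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        powers.append(math.log10(abs(acc) ** 2 + 1e-12))
    return powers


def extract_features(iq: Sequence[complex]) -> List[float]:
    """Map a complex IQ capture to N_BANDS * N_STATS envelope features."""
    envelope = [abs(sample) for sample in iq]
    frames = [envelope[i:i + FRAME_LEN]
              for i in range(0, len(envelope) - FRAME_LEN + 1, FRAME_LEN)]
    if len(frames) < 2:
        raise ValueError("need at least two full frames")
    spectrogram = [band_log_power(frame) for frame in frames]

    features: List[float] = []
    for band in range(N_BANDS):
        series = [row[band] for row in spectrogram]
        # Assumed drift correction: remove the per-band median so that a
        # constant gain offset between sessions cancels out.
        baseline = statistics.median(series)
        centred = [value - baseline for value in series]
        steps = [abs(nxt - cur) for cur, nxt in zip(centred, centred[1:])]
        features.extend([
            statistics.fmean(centred),
            statistics.pstdev(centred),
            min(centred),
            max(centred),
            statistics.fmean(steps),
        ])
    return features
```

Because the statistics are taken on median-centred log power, scaling the whole capture by a constant gain leaves the feature vector essentially unchanged. That is one simple sense in which a pipeline can be insensitive to slow receiver drift; time-varying drift within a capture would need more than this sketch.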
What carries the argument
Passive electromagnetic side-channel sensing that records macroscopic EM envelopes emitted by distinct hardware usage patterns of agent skills, captured by external SDRs without any host software involvement.
Load-bearing premise
Distinct agent skills must produce sufficiently unique and stable hardware activity patterns whose electromagnetic signatures remain distinguishable by external sensors despite real-world interference and host compromise.
What would settle it
An experiment showing that two different agent skills produce overlapping EM signatures under identical conditions, or that an attacker can force one skill's hardware behavior to emit the EM envelope of another while preserving the intended workflow.
Original abstract
Autonomous LLM agents face a critical security risk known as workflow hijacking, where attackers subtly alter tool and skill invocations. Existing defenses rely on host-internal telemetry (such as audit logs), which can be forged if the host OS is compromised. To solve this, we introduce ClawGuard, a passive, out-of-band monitor that audits LLM-agent workflows using electromagnetic (EM) emanations. Because distinct agent skills create unique hardware usage patterns (computation, DRAM, network blocking), they emit measurable, macroscopic EM envelopes. External software-defined radios (SDRs) capture these physical signals. Using a drift-aware pipeline with 320-dimensional features, ClawGuard converts RF streams into physical evidence. Evaluated on a 7.82TB RF corpus, ClawGuard achieved an AUC of 0.9945, detecting attacks with a 100% true-positive rate and a 1.16% false-positive rate. This proves passive EM sensing is a practical, forge-resistant physical check against compromised host software.
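The three headline numbers measure different things: the true-positive and false-positive rates describe one decision threshold, while AUC summarizes ranking quality across all thresholds. The sketch below computes each from detector scores; the scores and labels are invented toy values, not data from the paper, and the function names are illustrative.

```python
from typing import Sequence, Tuple


def tpr_fpr(scores: Sequence[float], labels: Sequence[int],
            threshold: float) -> Tuple[float, float]:
    """Rates when every score >= threshold is flagged as a hijack (label 1)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random attack outscores a random benign run.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: three attacks (label 1) and five benign runs (label 0).
toy_scores = [0.9, 0.8, 0.7, 0.75, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]

print(tpr_fpr(toy_scores, toy_labels, threshold=0.7))  # (1.0, 0.2)
print(round(auc(toy_scores, toy_labels), 4))           # 0.9333
```

In the toy data a threshold low enough to catch every attack also flags one benign run, so a 100% true-positive rate coexists with a nonzero false-positive rate and an AUC below 1. The reported combination of 100% TPR, 1.16% FPR, and AUC 0.9945 has the same shape.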
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces ClawGuard, a passive out-of-band system for detecting LLM agent workflow hijacking attacks. It uses electromagnetic side-channel monitoring via external software-defined radios to capture macroscopic EM emanations arising from distinct agent skills' hardware usage patterns (computation, DRAM access, network activity). A drift-aware pipeline extracts 320-dimensional features from RF streams, and the system is evaluated on a 7.82 TB corpus, reporting an AUC of 0.9945 with 100% true-positive rate and 1.16% false-positive rate. The central claim is that this provides a forge-resistant physical-layer defense independent of potentially compromised host telemetry.
Significance. If the results prove robust, the work would represent a meaningful advance in securing autonomous LLM agents by exploiting physical invariants that are difficult to forge from software. The scale of the RF corpus is a positive aspect of the evaluation design. However, the absence of methodological details prevents a full assessment of whether the approach delivers a reliable, generalizable physical check or merely reflects corpus-specific artifacts.
major comments (2)
- [Evaluation] Evaluation section: The manuscript reports strong performance metrics (AUC 0.9945, 100% TPR, 1.16% FPR) on a 7.82 TB corpus but supplies no description of data collection procedures, feature engineering for the 320-dimensional vectors, environmental controls, hardware platform variation, or controls for signal interference and concurrent host loads. This directly undermines the central claim that the features capture skill-specific EM envelopes rather than transient or environment-dependent effects.
- [§3] §3 (System Design and Assumptions): The premise that distinct agent skills produce reliably separable macroscopic EM envelopes even under host compromise and real-world RF interference is stated without quantitative support, such as separability metrics or ablation studies under added DRAM contention or external emitters. Because the headline detection rates rest on this untested separability, the experimental results cannot yet be interpreted as evidence of a practical physical invariant.
minor comments (1)
- [Abstract] The abstract introduces the term 'drift-aware pipeline' without a brief definition or reference to the relevant section explaining how drift is detected or mitigated.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and for recognizing the potential significance of ClawGuard as a physical-layer defense. We address each major comment point by point below. We have revised the manuscript to add methodological detail and quantitative support; the core claims and results are unchanged.
Point-by-point responses
- Referee: [Evaluation] Evaluation section: The manuscript reports strong performance metrics (AUC 0.9945, 100% TPR, 1.16% FPR) on a 7.82 TB corpus but supplies no description of data collection procedures, feature engineering for the 320-dimensional vectors, environmental controls, hardware platform variation, or controls for signal interference and concurrent host loads. This directly undermines the central claim that the features capture skill-specific EM envelopes rather than transient or environment-dependent effects.
  Authors: We agree that expanded methodological details improve interpretability and reproducibility. In the revised manuscript we have substantially enlarged the Evaluation section. It now includes: (1) a complete description of the data-collection apparatus and protocol (specific SDR hardware, antenna placement, sampling rates, and session durations that produced the 7.82 TB corpus); (2) the exact feature-engineering pipeline that yields the 320-dimensional drift-aware vectors, including time-frequency transforms and normalization steps; (3) environmental controls (Faraday-cage shielding, temperature/humidity logging, and baseline noise measurements); (4) hardware-platform variation experiments across three distinct host configurations; and (5) explicit controls for concurrent host loads and external RF interference, with quantitative results showing that detection performance remains stable. These additions directly corroborate that the learned features reflect skill-specific macroscopic EM envelopes rather than transient environmental effects.
  revision: yes
- Referee: [§3] §3 (System Design and Assumptions): The premise that distinct agent skills produce reliably separable macroscopic EM envelopes even under host compromise and real-world RF interference is stated without quantitative support, such as separability metrics or ablation studies under added DRAM contention or external emitters. Because the headline detection rates rest on this untested separability, the experimental results cannot yet be interpreted as evidence of a practical physical invariant.
  Authors: We acknowledge the value of explicit quantitative backing for the separability assumption. While the headline metrics already constitute empirical evidence obtained under realistic conditions (including variable host loads and ambient RF), we have added two new elements to the revised manuscript. First, §3 now reports separability metrics (average inter-class Euclidean distance and silhouette coefficient) computed on the 320-dimensional feature vectors for the ten agent skills; these metrics confirm clear separation. Second, we include ablation results in the Evaluation section that inject controlled DRAM contention (via concurrent memory-bound processes) and external RF emitters (via calibrated signal generators at varying power levels). Under these conditions the AUC remains above 0.98 with negligible degradation in TPR/FPR, supporting the claim that the physical invariant is robust to host compromise and interference. The out-of-band architecture continues to guarantee independence from any forged host telemetry.
  revision: yes
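The two separability metrics named in the rebuttal can be computed directly from labelled feature vectors. The sketch below is one plausible reading: it takes "average inter-class Euclidean distance" to mean the mean distance between class centroids, which is an assumption, and uses the standard silhouette definition. The function names and the toy points are illustrative, and at least two classes are required.

```python
import math
from collections import defaultdict
from itertools import combinations
from statistics import fmean
from typing import Hashable, List, Sequence

Vector = Sequence[float]


def centroid(points: List[Vector]) -> List[float]:
    """Coordinate-wise mean of a list of equal-length vectors."""
    return [fmean(column) for column in zip(*points)]


def mean_intercentroid_distance(x: Sequence[Vector],
                                y: Sequence[Hashable]) -> float:
    """Mean Euclidean distance over all pairs of class centroids."""
    groups = defaultdict(list)
    for vector, label in zip(x, y):
        groups[label].append(vector)
    centroids = [centroid(points) for points in groups.values()]
    return fmean([math.dist(a, b) for a, b in combinations(centroids, 2)])


def silhouette(x: Sequence[Vector], y: Sequence[Hashable]) -> float:
    """Mean silhouette coefficient; near 1 is well separated, below 0 is mixed."""
    scores = []
    for i, (vector, label) in enumerate(zip(x, y)):
        dists = defaultdict(list)
        for j, (other, other_label) in enumerate(zip(x, y)):
            if j != i:
                dists[other_label].append(math.dist(vector, other))
        if label not in dists:  # singleton class scores 0 by convention
            scores.append(0.0)
            continue
        a = fmean(dists[label])                       # cohesion
        b = min(fmean(d) for k, d in dists.items() if k != label)  # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return fmean(scores)


# Toy 2-D points standing in for feature vectors of two skills.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(mean_intercentroid_distance(points, [0, 0, 1, 1]))  # 10.0
print(silhouette(points, [0, 0, 1, 1]) > 0.9)             # True
print(silhouette(points, [0, 1, 0, 1]) < 0)               # True
```

The last line relabels the same points so that the two classes interleave; the silhouette turns negative. This is the kind of outcome the "What would settle it" test asks about: two skills whose signatures overlap would score near or below zero, whatever the headline detection rates say.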
Circularity Check
No circularity detected; results are empirical performance metrics on collected RF data
full rationale
The paper reports experimental detection performance (AUC 0.9945, 100% TPR, 1.16% FPR on 7.82 TB corpus) from a drift-aware ML pipeline applied to SDR-captured EM signals. No equations, derivations, or first-principles claims are presented that reduce the detection result to fitted parameters, self-definitions, or self-citations by construction. The premise that distinct skills produce unique macroscopic EM envelopes is stated as an empirical observation motivating the approach, not derived from prior results within the paper. The evaluation metrics are direct measurements on held-out data rather than predictions forced by the training process or renamed known patterns. This is a standard empirical security paper with no load-bearing circular steps.
Axiom & Free-Parameter Ledger
free parameters (1)
- 320-dimensional feature set
axioms (1)
- domain assumption: Distinct agent skills produce distinguishable macroscopic EM emanations from hardware activity
Appendix
This appendix supports the robustness discussion in §VI-D. It records a deliberately difficult stress camp...