An Application-Layer Multi-Modal Covert-Channel Reference Monitor for LLM Agent Egress

Alfredo Metere

arxiv: 2605.20734 · v1 · pith:2BF32T5Cnew · submitted 2026-05-20 · 💻 cs.CR · cs.AI

Pith reviewed 2026-05-21 04:33 UTC · model grok-4.3

keywords covert channels · LLM agents · egress monitoring · reference monitor · multi-modal security · cryptographic attestation · capacity reduction

The pith

A reference monitor for LLM agent egress drives residual covert-channel capacity to zero on all destroyable carriers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper describes a system that stops LLM agents from leaking information by hiding bits inside ordinary-looking messages or media. It applies a ten-stage text pipeline with a per-sink leaky-bucket ledger, plus scramblers for audio and images that skip processing only when a payload carries a signature that verifies against an Ed25519 key published at boot for an authorized kind and data class. Residual capacity is measured as the Miller-Madow corrected mutual information between the bits an adversary embeds and the bits that can later be recovered, and the implementation reduces this quantity to zero wherever the carrier can be destroyed without ruining the payload. A reader would care because standard allowlists and content scanners miss encodings such as zero-width characters, JSON key order, LSB planes, or inter-image permutations.

Core claim

The reference implementation drives residual capacity to zero on every destroyable channel and to a stated bound on the one (per-image mean luminance) that cannot be destroyed without ruining the image. This is achieved with a text pipeline of ten capacity-reducing stages and a per-sink leaky-bucket ledger; two media scramblers (a Fourier-domain audio band-limiter and an RGB image bit-depth and mean-luminance bucketer) that are gated by a boot-time attestation of trusted Ed25519 keys and {kind, data-class} pairs; and a content-addressed canonicalizer that closes permutation channels. Residual capacity is measured by an adversarial ensemble of fifteen encoders across modalities.
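To make the image side of this claim concrete, here is a minimal sketch of bit-depth reduction and mean-luminance bucketing. It is not the paper's implementation: the pixel representation, the `keep_bits` and `step` parameters, and the use of a plain channel average as "luminance" are all assumptions for illustration, and the step that rewrites the image toward its bucket's representative value is omitted.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # 8-bit RGB; an assumed representation


def reduce_bit_depth(pixels: List[Pixel], keep_bits: int = 5) -> List[Pixel]:
    """Zero the low (8 - keep_bits) bits of every channel."""
    if not 1 <= keep_bits <= 8:
        raise ValueError("keep_bits must be in 1..8")
    mask = 0xFF & ~((1 << (8 - keep_bits)) - 1)
    return [(r & mask, g & mask, b & mask) for r, g, b in pixels]


def mean_luminance_bucket(pixels: List[Pixel], step: int = 16) -> int:
    """Index of the bucket that the image's mean luminance falls into."""
    total = sum(r + g + b for r, g, b in pixels)
    mean = total / (3 * len(pixels))  # plain channel average (assumption)
    return int(mean // step)


# Hypothetical adversary: hide one bit per pixel in the red LSB.
cover = [(120, 64, 200)] * 8
secret = [1, 0, 1, 1, 0, 0, 1, 0]
stego = [((r & ~1) | bit, g, b) for (r, g, b), bit in zip(cover, secret)]

cleaned = reduce_bit_depth(stego, keep_bits=5)
recovered = [r & 1 for r, _, _ in cleaned]  # every LSB is now zero
```

In this toy run the LSB payload does not survive, while the cover and stego images still land in the same luminance bucket, so an observer of the bucket index alone cannot tell them apart.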

What carries the argument

The multi-modal egress reference monitor that combines staged lossless text reductions, attestation-gated media scramblers, and mutual-information capacity measurement.
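For the audio half of that machinery, the following is a rough sketch of what a Fourier-domain band-limiter could look like. It assumes a brick-wall cutoff and uses a naive O(n²) DFT from the standard library so that it stays self-contained; the function names, the cutoff value, and the test tones are hypothetical and not taken from the paper.

```python
import cmath
import math
from typing import List


def dft(x: List[float]) -> List[complex]:
    """Naive discrete Fourier transform (fine for short illustrative signals)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum: List[complex]) -> List[float]:
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def band_limit(samples: List[float], sample_rate: float, cutoff_hz: float) -> List[float]:
    """Zero every frequency bin above cutoff_hz (brick-wall, an assumption)."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Bins k and n - k are mirror images of the same physical frequency.
        freq = min(k, n - k) * sample_rate / n
        if freq > cutoff_hz:
            spectrum[k] = 0
    return idft(spectrum)


# A 1 kHz audible tone plus a 20 kHz near-ultrasonic carrier, sampled at 48 kHz.
n, rate = 96, 48000.0
signal = [math.sin(2 * math.pi * 1000 * t / rate)
          + 0.5 * math.sin(2 * math.pi * 20000 * t / rate)
          for t in range(n)]
filtered = band_limit(signal, rate, cutoff_hz=8000.0)
```

With 96 samples at 48 kHz each bin is 500 Hz wide, so the audible tone sits in bin 2 and the carrier in bin 40; after filtering, only the former retains energy.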

If this is right

LLM agents lose the ability to exfiltrate data through text encodings such as homoglyphs or whitespace once the ten-stage pipeline is applied.
Unsigned or mismatched media is blocked or scrambled by default, removing the need for content-based classification of real versus sonified payloads.
Inter-image permutation channels are closed by the content-addressed canonicalizer before any scrambling occurs.
Capacity is quantified to zero on destroyable channels and to an explicit bound on the luminance channel that must remain intact.
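The paper's ten text stages are not enumerated in the material above, so the following is only a sketch of what a few capacity-reducing stages of this kind might look like: stripping zero-width characters, folding Unicode compatibility forms, collapsing whitespace, and canonicalizing JSON key order. The stage names and their order are assumptions. Note that NFKC folds compatibility characters such as fullwidth letters but does not map cross-script homoglyphs (for example Cyrillic "а" to Latin "a"), which would need a separate confusables table.

```python
import json
import re
import unicodedata

# Code points deleted outright; a small illustrative subset, not a complete list.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))


def strip_zero_width(text: str) -> str:
    """Delete zero-width characters that can encode hidden bits."""
    return text.translate(ZERO_WIDTH)


def fold_compat(text: str) -> str:
    """Apply Unicode NFKC normalization (folds compatibility variants)."""
    return unicodedata.normalize("NFKC", text)


def collapse_whitespace(text: str) -> str:
    """Replace each run of whitespace with a single space."""
    return re.sub(r"\s+", " ", text).strip()


def normalize_text(text: str) -> str:
    """Run the three text stages in a fixed order."""
    return collapse_whitespace(fold_compat(strip_zero_width(text)))


def canonical_json(text: str) -> str:
    """Re-serialize JSON with sorted keys and fixed separators."""
    return json.dumps(json.loads(text), sort_keys=True, separators=(",", ":"))


clean = normalize_text("pay\u200bload  is\u200c \tfine")
same = canonical_json('{"b": 1, "a": 2}') == canonical_json('{"a": 2, "b": 1}')
```

Each stage maps many possible encodings of the same visible message to one output, which is what removes the adversary's choice and therefore the channel.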

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same attestation-plus-scrambler pattern could be applied to other agent output types such as network packets or API responses.
A production deployment would need to measure end-to-end latency and false-positive rates on real LLM traffic.
The stated luminance bound implies a tunable security-versus-quality trade-off that future work could optimize with different bucketing parameters.

Load-bearing premise

Boot-time Ed25519 keys and {kind, data-class} pairs can be published and verified so that only legitimate media is exempted while all other payloads are treated as suspect, without creating new covert channels or false blocks.
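The gating logic this premise depends on can be sketched as follows. Everything here is hypothetical: the `BootManifest` structure, the `is_exempt` function, and the choice to bind kind and data class into the signed message are illustrative assumptions, and the signature check is abstracted behind a `verify` callable where a real Ed25519 verification routine would go. The stand-in used in the example is a keyed hash, not a signature scheme, and exists only to exercise the gate.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, FrozenSet, Mapping, Tuple

# (public_key, message, signature) -> True if the signature verifies.
Verify = Callable[[bytes, bytes, bytes], bool]


@dataclass(frozen=True)
class BootManifest:
    """What the auditor publishes at boot (assumed shape)."""
    trusted_keys: Mapping[str, bytes]      # key id -> Ed25519 public key bytes
    allowed: FrozenSet[Tuple[str, str]]    # authorized (kind, data_class) pairs


def is_exempt(manifest: BootManifest, verify: Verify, payload: bytes,
              signature: bytes, key_id: str, kind: str, data_class: str) -> bool:
    """Exempt a payload from scrambling only if every check passes."""
    if (kind, data_class) not in manifest.allowed:
        return False                       # class not authorized at boot
    public_key = manifest.trusted_keys.get(key_id)
    if public_key is None:
        return False                       # unknown signer: suspect by default
    # Binding kind and class into the message is an assumption of this sketch.
    message = kind.encode() + b"\x00" + data_class.encode() + b"\x00" + payload
    return verify(public_key, message, signature)


def stand_in_verify(public_key: bytes, message: bytes, signature: bytes) -> bool:
    """NOT a signature scheme; a placeholder so the example runs."""
    return signature == hashlib.sha256(public_key + message).digest()


manifest = BootManifest({"studio-1": b"pk"}, frozenset({("image", "public")}))
payload = b"...image bytes..."
good_sig = hashlib.sha256(b"pk" + b"image\x00public\x00" + payload).digest()
```

Because the manifest is frozen and every failure path returns `False`, anything unsigned, signed by an unknown key, or presented under an unauthorized class falls through to the scramblers.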

What would settle it

An experiment in which an adversarial encoder embeds recoverable bits into an attested image and recovers them above the stated mean-luminance bound after the monitor has processed the image would falsify the residual-capacity claim.
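The paper's bound value is not reproduced here, but the kind of limit such an experiment would test against can be illustrated with a generic information-theoretic argument. The symbols below are assumptions of this sketch, not the paper's notation: if the bucketer forces mean luminance to one of K representative values, the channel output per image has at most K states.

```latex
% Assumptions (not from the paper): 8-bit luminance range, uniform bucket width \Delta,
% X = bits the adversary tries to embed, Y = bucketed mean luminance of one image.
\[
  I(X;Y) \;\le\; H(Y) \;\le\; \log_2 K,
  \qquad K = \left\lceil \frac{256}{\Delta} \right\rceil .
\]
% Example: \Delta = 16 gives K = 16, i.e. at most 4 bits per image.
```

Under these assumptions, recovering more than log2 K bits per image through mean luminance alone would indicate that the bucketer is not doing what it claims.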

Figures

Figures reproduced from arXiv: 2605.20734 by Alfredo Metere.

**Figure 2.** Design-space probe images. The same 128×128 cover under twelve representative defense settings. The cover is a smooth additive RGB gradient: R, G, and B all grow simultaneously along different axes, so the visible result is a continuous diagonal traversal of the gamut rather than separable red, green, and blue bands: from dark blue-grey in the top-left (all three channels low) through magenta-ish (top-rig…

**Figure 3.** Audio-scrambler spectral defense in action. Magnitude spectra (dB relative to the loudest

Original abstract

A large language model (LLM) agent that sends messages can leak data inside them. Destination allowlists and content scanners do not police whether an otherwise-benign payload is itself a covert channel: a compromised agent encodes bits in zero-width characters, homoglyphs, whitespace, base64, JavaScript Object Notation (JSON) key ordering, message timing or size -- and, in binary egress, in least-significant-bit (LSB) pixel planes, per-image mean luminance, inter-image sequence permutation, ultrasonic tones, or audible-band sonified data. Our egress reference monitor has three contributions. (i) A text pipeline of ten capacity-reducing stages, a per-sink leaky-bucket capacity ledger, and a staged posture that enforces lossless stages from day one. (ii) Two media scramblers (a Fourier-domain audio band-limiter and a red-green-blue (RGB) image bit-depth and mean-luminance bucketer) gated by a boot-time cryptographic legitimacy attestation: an auditor publishes at boot the trusted Ed25519 keys and {kind, data-class} pairs; only payloads with a verifying signature for an authorized class are exempt. The attestation sidesteps the intractable content-based discrimination between real media and data sonified or rasterized as a carrier; unsigned media is suspect by default; a content-addressed canonicalizer closes the inter-image permutation channel. (iii) Residual capacity is the Miller--Madow corrected mutual information between embedded and recovered bits (zero when destroyed), measured by an adversarial ensemble of fifteen working encoders across text, image and audio. The reference implementation drives residual capacity to zero on every destroyable channel and to a stated bound on the one (per-image mean luminance) that cannot be destroyed without ruining the image.
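The inter-image permutation channel mentioned in the abstract exists because n distinct images can be sent in n! different orders, so sequence order alone can carry up to log2(n!) bits. A content-addressed canonicalizer could close it along the following lines; this is a minimal sketch that assumes SHA-256 as the content address and a hypothetical `canonical_order` function, neither of which is confirmed by the abstract.

```python
import hashlib
from itertools import permutations
from typing import List


def canonical_order(images: List[bytes]) -> List[bytes]:
    """Emit images sorted by content hash, ignoring the order they arrived in."""
    return sorted(images, key=lambda blob: hashlib.sha256(blob).digest())


# Four distinct payloads can be arranged 4! = 24 ways by the sender.
batch = [b"img-a", b"img-b", b"img-c", b"img-d"]
outputs = {tuple(canonical_order(list(p))) for p in permutations(batch)}
```

All 24 input orderings collapse to a single output ordering, so the receiver learns nothing from the sequence.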

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper offers a workable architecture for limiting multi-modal covert channels from LLM agents but needs more on implementation and attestation security.

This paper puts forward a reference monitor that aims to block covert channels in the multi-modal outputs of LLM agents. It layers capacity-reducing stages on text, applies targeted scramblers to images and audio, and uses cryptographic attestation to permit legitimate media while flagging the rest. What stands out as new is the integration of a ten-stage text pipeline with a per-sink leaky-bucket ledger, a Fourier-domain audio band-limiter, an RGB image bit-depth and mean-luminance bucketer, and a boot-time Ed25519 attestation system. The attestation lets an auditor publish trusted keys and {kind, data-class} pairs at boot so that only verifying payloads are exempt; everything else is suspect by default. A content-addressed canonicalizer handles the inter-image permutation channel. Residual capacity is quantified via Miller-Madow corrected mutual information between embedded and recovered bits, tested against an adversarial ensemble of fifteen encoders. The implementation is said to drive this to zero on destroyable channels and to a bound on the luminance channel.

The design does a solid job of addressing a growing practical risk in AI deployments where agents can hide data in normal-looking text or media. The application-layer focus makes it relevant where lower-level network controls are not sufficient.

The soft spots are in the missing details. The abstract gives no implementation code, no error analysis for the capacity measurements, and no explicit check that the fifteen encoders cover every possible encoding method. The attestation is central, but the description leaves open whether the verification process itself could leak bits through key choice, publication timing, or metadata in the canonicalizer. The bound on the luminance channel is stated without showing the independent derivation in the provided summary.

This work is aimed at security practitioners and researchers who design controls for LLM agents in enterprise or regulated settings. Readers who want concrete system architectures for reducing exfiltration risks will get the most value from it. The paper deserves a serious referee because it offers measurable claims on a timely problem and builds on established ideas in a useful way. I would recommend sending it to peer review, asking the authors to supply the implementation details, the full measurement results, and a clearer analysis of whether the attestation protocol avoids creating new channels.

Referee Report

3 major / 2 minor

Summary. The manuscript describes an application-layer reference monitor for LLM agent egress that aims to eliminate covert channels across text, image, and audio. It contributes (i) a text pipeline of ten capacity-reducing stages with a per-sink leaky-bucket ledger and staged enforcement, (ii) Fourier-domain audio band-limiting and RGB image bit-depth/mean-luminance bucketing gated by boot-time Ed25519 attestation of trusted keys and {kind, data-class} pairs (with unsigned media treated as suspect and a content-addressed canonicalizer for permutation channels), and (iii) residual capacity measured as Miller-Madow corrected mutual information between embedded and recovered bits via an adversarial ensemble of fifteen encoders. The reference implementation is asserted to drive this residual capacity to zero on all destroyable channels and to a stated bound on the non-destroyable per-image mean luminance channel.

Significance. If the empirical claims hold under the stated measurement protocol, the work provides a concrete, deployable defense against a broad class of covert exfiltration vectors that evade destination allowlists and content scanners. The cryptographic attestation mechanism to exempt legitimate media without content-based discrimination, combined with the adversarial ensemble evaluation and leaky-bucket accounting, represents a practical engineering contribution that could serve as a reference for securing LLM agents. Credit is due for the explicit use of Miller-Madow correction and the multi-modal scope.

major comments (3)

[Abstract] the central claim that the reference implementation drives residual capacity to zero on every destroyable channel provides no implementation details, error analysis, or verification that the ensemble of fifteen encoders covers all possible channels (including timing, size, JSON ordering, and sonification variants).
[Abstract] the stated bound on residual capacity for the per-image mean luminance channel is asserted without an independent derivation, justification of the bound value, or analysis of how the bucketer interacts with image usability constraints.
[Attestation mechanism] the claim that boot-time Ed25519 attestation plus {kind, data-class} pairs reliably exempts only legitimate media without introducing new covert channels (via key selection, publication timing, or canonicalizer metadata) lacks a concrete protocol description or zero-leakage argument.

minor comments (2)

[Abstract] the text pipeline is described as having 'ten capacity-reducing stages' but the stages are not enumerated; a table or explicit list would improve clarity and allow readers to assess coverage.
[Evaluation] The Miller-Madow estimator is invoked as standard, yet its specific application (binning, sample size, correction term) to the recovered-bit measurements should be stated explicitly for reproducibility.
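As an illustration of the level of detail this comment asks for, here is one common way to apply the Miller-Madow correction to a mutual-information estimate. The decomposition I = H(X) + H(Y) − H(X,Y), the per-term correction (m − 1)/(2N ln 2) in bits with m the number of non-empty bins, and the clamp at zero are assumptions of this sketch; the paper's exact binning and sample sizes are not stated in the abstract.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def mm_entropy(samples: Sequence[Hashable]) -> float:
    """Plug-in entropy in bits plus the Miller-Madow bias correction."""
    n = len(samples)
    counts = Counter(samples)
    plug_in = -sum((c / n) * math.log2(c / n) for c in counts.values())
    occupied = len(counts)  # number of non-empty bins, m
    return plug_in + (occupied - 1) / (2 * n * math.log(2))


def mm_mutual_information(embedded: Sequence[int], recovered: Sequence[int]) -> float:
    """Miller-Madow corrected I(embedded; recovered) in bits, clamped at 0."""
    if len(embedded) != len(recovered):
        raise ValueError("sequences must have equal length")
    joint = list(zip(embedded, recovered))
    mi = mm_entropy(embedded) + mm_entropy(recovered) - mm_entropy(joint)
    return max(0.0, mi)


embedded = [i % 2 for i in range(1000)]   # balanced test bits
intact = mm_mutual_information(embedded, embedded)        # channel untouched
destroyed = mm_mutual_information(embedded, [0] * 1000)   # carrier zeroed
```

With an untouched one-bit channel the estimate is close to 1 bit per symbol, and when the recovered bits are constant the estimate is zero, matching the abstract's "zero when destroyed".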

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback and for acknowledging the significance of our application-layer reference monitor for LLM agent egress. We address each major comment in detail below, indicating the revisions planned for the manuscript.

Referee: [Abstract] the central claim that the reference implementation drives residual capacity to zero on every destroyable channel provides no implementation details, error analysis, or verification that the ensemble of fifteen encoders covers all possible channels (including timing, size, JSON ordering, and sonification variants).

Authors: The manuscript's full text details the implementation of the ten capacity-reducing stages in the text pipeline, the per-sink leaky-bucket ledger, and the staged enforcement. The adversarial ensemble of fifteen encoders is specified to cover a range of channels, explicitly including timing, size, JSON ordering, and sonification variants as part of the evaluation across text, image, and audio. The Miller-Madow correction provides the statistical error analysis for the mutual information estimates. To address the concern about the abstract, we will revise it to include a concise reference to these elements and the comprehensive nature of the encoder ensemble.

revision: yes

Referee: [Abstract] the stated bound on residual capacity for the per-image mean luminance channel is asserted without an independent derivation, justification of the bound value, or analysis of how the bucketer interacts with image usability constraints.

Authors: The bound is based on the quantization introduced by the mean-luminance bucketer, which is designed to limit covert capacity while preserving image usability. We will add to the revised manuscript an independent derivation of the bound, including justification of its value derived from the bucketing granularity and an analysis of its interaction with usability constraints using standard metrics such as PSNR or perceptual quality assessments.

revision: yes

Referee: [Attestation mechanism] the claim that boot-time Ed25519 attestation plus {kind, data-class} pairs reliably exempts only legitimate media without introducing new covert channels (via key selection, publication timing, or canonicalizer metadata) lacks a concrete protocol description or zero-leakage argument.

Authors: The attestation protocol is outlined as an auditor publishing trusted Ed25519 keys and authorized {kind, data-class} pairs at boot time, with verification required for exemption and unsigned media treated as suspect. The content-addressed canonicalizer addresses permutation channels. We maintain that this setup avoids new covert channels because the keys and classes are fixed at boot and not selectable by the agent at runtime, timing is outside agent control, and metadata is minimized. However, we will provide a more detailed step-by-step protocol description and a formal argument for the absence of leakage in the revised manuscript.

revision: partial

Circularity Check

0 steps flagged

No circularity: results are empirical measurements on an implemented monitor

The paper describes concrete pipelines (ten text stages, Fourier audio limiter, RGB bit-depth and luminance bucketer) plus a boot-time Ed25519 attestation mechanism, then reports measured residual capacity via Miller-Madow corrected mutual information on an adversarial encoder ensemble. No equations are presented that reduce a claimed first-principles result to the inputs by construction, no parameters are fitted on a subset and then relabeled as predictions, and no self-citations or uniqueness theorems are invoked to justify core choices. The stated bound on the luminance channel is presented as an empirical limit required to preserve image utility rather than a tautological re-expression of the measurement definition itself. The derivation chain is therefore self-contained and externally falsifiable through the described implementation and measurement procedure.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Abstract provides limited detail on internal parameters or assumptions; the leaky-bucket ledger and per-sink capacities are mentioned without explicit values or derivation.

free parameters (1)

per-sink leaky-bucket capacity
Capacity limits per destination sink are referenced but no numerical values or fitting procedure are given in the abstract.
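Since the abstract gives no values for this parameter, the following sketch only shows where such values would enter a per-sink leaky-bucket ledger. The class names, the `capacity_bits` and `leak_bits_per_s` parameters, and the numbers in the example are hypothetical; the caller supplies the clock explicitly so the behavior is deterministic.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Bucket:
    level: float = 0.0     # bits currently charged against this sink
    updated: float = 0.0   # timestamp of the last update, in seconds


@dataclass
class CapacityLedger:
    capacity_bits: float       # maximum outstanding bits per sink (free parameter)
    leak_bits_per_s: float     # rate at which a bucket drains (free parameter)
    buckets: Dict[str, Bucket] = field(default_factory=dict)

    def charge(self, sink: str, bits: float, now: float) -> bool:
        """Try to spend `bits` of budget on `sink`; return False if over capacity."""
        bucket = self.buckets.setdefault(sink, Bucket(updated=now))
        elapsed = max(0.0, now - bucket.updated)
        # Drain the bucket for the time that has passed, never below empty.
        bucket.level = max(0.0, bucket.level - elapsed * self.leak_bits_per_s)
        bucket.updated = now
        if bucket.level + bits > self.capacity_bits:
            return False       # rejected; the level is left unchanged
        bucket.level += bits
        return True


ledger = CapacityLedger(capacity_bits=100.0, leak_bits_per_s=10.0)
first = ledger.charge("mail", 80.0, now=0.0)    # fits: level becomes 80
second = ledger.charge("mail", 40.0, now=1.0)   # drained to 70, 70 + 40 > 100
third = ledger.charge("mail", 40.0, now=5.0)    # drained to 30, 30 + 40 fits
other = ledger.charge("chat", 100.0, now=0.0)   # separate sink, separate budget
```

The two parameters together cap the long-run leak rate per sink at `leak_bits_per_s` while allowing bursts up to `capacity_bits`, which is why their values matter for any capacity claim.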

axioms (1)

domain assumption: Cryptographic attestation at boot can be trusted to publish valid Ed25519 keys and class pairs
The monitor relies on this to exempt signed media without content inspection.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "The reference implementation drives residual capacity to zero on every destroyable channel and to a stated bound on the one (per-image mean luminance) that cannot be destroyed without ruining the image."

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "A boot-time cryptographic legitimacy attestation: an auditor publishes at boot the trusted Ed25519 keys and {kind, data-class} pairs"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
