pith. sign in

arxiv: 2606.29389 · v1 · pith:7UXUUP7Hnew · submitted 2026-06-28 · 💻 cs.CR · cs.LG

Exploring the Cryptographic Limits of Transformer Networks

Pith reviewed 2026-06-30 07:27 UTC · model grok-4.3

classification 💻 cs.CR cs.LG
keywords transformerscryptographythreshold circuitsKeccakMerkle-DamgardMerkle treesscaling lawssteganography
0
0 comments X

The pith

Transformers of given depth and width cannot implement arbitrary cryptographic functions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper derives constructive upper bounds showing what cryptographic functions a transformer of fixed depth and width can or cannot realize. It first builds threshold circuits for Keccak, Merkle-Damgård constructions, and Merkle trees, then maps those circuits onto transformer architectures using no-attention and tokens-as-gates schemes. Scaling laws are obtained that translate circuit size directly into required transformer width and depth. A sympathetic reader would care because the bounds give a way to decide whether transformer-based agents could generate source-free randomness for steganographic or other hidden communication tasks. The approach supplies structural guarantees on capacity instead of relying solely on empirical checks.

Core claim

By generating threshold circuits for Keccak functions, Merkle-Damgård constructions and Merkle trees, then mapping the circuits to transformer architectures via no-attention and tokens-as-gates schemes, the work obtains verified scaling laws for width and depth that yield constructive upper bounds on the cryptographic functions any transformer of those dimensions can compute.

What carries the argument

The exact mapping from threshold-circuit representations of cryptographic primitives onto transformer layers that preserves functionality.

If this is right

  • Transformers below the derived bounds cannot perform the listed cryptographic operations and therefore cannot use them for source-free randomness in steganography.
  • Capability evaluations of transformer-based AI systems can apply these bounds to rule out certain malicious behaviors.
  • The same circuit-to-transformer mapping technique can be reused to obtain bounds for additional functions beyond the three constructions studied.
  • Each cryptographic construction has its own explicit width and depth scaling law that must be met for exact implementation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same circuit-mapping method could be extended to arithmetic or logical primitives to produce broader capacity limits for transformers.
  • Actual trained models might approximate the functions with fewer resources than the exact constructive mappings require.
  • Hardware designers could use the scaling laws to set minimum transformer sizes needed for security-related computations.

Load-bearing premise

Saturated transformers can be treated as threshold circuits whose generated circuits for the cryptographic constructions can be mapped to the proposed transformer architectures while preserving exact functionality.

What would settle it

A concrete transformer architecture whose depth and width fall below the derived scaling-law bounds yet still computes one of the three cryptographic constructions exactly.

Figures

Figures reproduced from arXiv: 2606.29389 by Andis Draguns, Christian Schroeder de Witt, Isaac Robinson, Philip Torr, Stefan Domunco.

Figure 1
Figure 1. Figure 1: Circuit compiled for Keccak function with [PITH_FULL_IMAGE:figures/full_fig_p011_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Plot for Keccak depth scaling. Differences between predicted and actual values come from [PITH_FULL_IMAGE:figures/full_fig_p013_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Plot for Keccak width scaling each Keccak state has 50 entries, two leaves being processed in parallel gives 2 · 50 = 100 nodes. The next six layers perform the Keccak toy function f . Each of these layers has a width twice as large as its corresponding layer in the circuit representation of f . Layers 9 and 10, the first layers after fleaves has been computed, XOR the previous results and thus start compu… view at source ↗
Figure 4
Figure 4. Figure 4: Circuit compiled for MD with compression function Keccak with [PITH_FULL_IMAGE:figures/full_fig_p014_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Plot for MD depth scaling E Tokens-as-Gates Correctness For the Tokens-as-Gates mapping we get the following flow of computation for the attention sublayer: qi = Qxi ki = Kxi vi = V xi si, j = q T i kj ai, j = ( 1 if si, j = 1 0 otherwise outi = ∑ j ai, j vj Looking at equations 26, 25, 27 note that K is a fixed matrix. When applied to xj = ej , it returns a vector whose i-th entry is 1 if gate j is a pred… view at source ↗
Figure 6
Figure 6. Figure 6: Plot for MD width scaling [PITH_FULL_IMAGE:figures/full_fig_p016_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Circuit compiled for MT with compression function Keccak with [PITH_FULL_IMAGE:figures/full_fig_p016_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Plot for MT depth scaling 1 2 4 8 16 Number of leaves nleaves 2000 4000 6000 8000 Circuit width Width vs nleaves (fixed: log2 w = 1, nrounds = 1, r = 12, c = 38, widthc = 550, leaf bits = 4) Actual Predicted: nleaves widthc [PITH_FULL_IMAGE:figures/full_fig_p017_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Plot for MT width scaling 17 [PITH_FULL_IMAGE:figures/full_fig_p017_9.png] view at source ↗
read the original abstract

In recent work it has been shown that colluding AI agents can use steganographic methods to exchange malicious information. Whether a transformer can implement steganographic methods depends on what cryptographic functions it can implement, since a transformer that can implement a cryptographic function within its layers has source-free randomness access. Despite existing circuit-complexity results, no prior work maps specific cryptographic constructions to transformer architectures. As Merrill et al. have shown that saturated transformers can be seen as threshold circuits, we first generate threshold circuits for three different cryptographic constructions (Keccak functions, Merkle--Damgard constructions and Merkle Trees) and then map these circuits to different transformer architectures. We derive verified scaling laws for the width and depth of the circuits which implement each cryptographic construction and propose two different mappings: no-attention mapping, tokens-as-gates mapping. Beyond its security implications, this work contributes to by establishing a methodology for deriving structural guarantees on transformer computational capacity. Specifically, we derive constructive upper bounds on what a transformer of a given depth and width could plausibly compute, providing a principled foundation for capability evaluations of transformer-based AI systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper constructs threshold circuits for Keccak functions, Merkle-Damgård constructions, and Merkle Trees, maps them to two transformer architectures (no-attention mapping and tokens-as-gates mapping), derives scaling laws for width and depth, and claims this provides constructive upper bounds on the computational capacity of transformers of given depth and width, with implications for steganographic capabilities in AI agents.

Significance. The constructive approach, including explicit circuit generations and mappings to transformers, offers a concrete methodology for linking cryptographic primitives to neural architectures. If the functionality-preserving mappings hold, this could inform capability evaluations, though the interpretation as upper bounds requires clarification as the results primarily establish lower bounds on required transformer sizes.

major comments (2)
  1. [Abstract] Abstract: The central claim that the work derives 'constructive upper bounds on what a transformer of a given depth and width could plausibly compute' is not supported by the described constructions. Generating threshold circuits and mapping them demonstrates that transformers of sufficient size can implement these functions, providing lower bounds on capacity rather than upper bounds.
  2. [Mapping to transformer architectures] Mapping to transformer architectures: The paper relies on the assumption from Merrill et al. that saturated transformers can be viewed as threshold circuits, and that the generated circuits can be mapped while preserving exact functionality. This mapping step is load-bearing for the claims but the abstract provides no verification details, circuit constructions, or error analysis to assess its validity.
minor comments (1)
  1. The abstract mentions 'verified scaling laws' but does not specify the verification method or provide the laws explicitly in the summary.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful review and for highlighting points that require clarification. We respond to each major comment below.

read point-by-point responses
  1. Referee: [Abstract] Abstract: The central claim that the work derives 'constructive upper bounds on what a transformer of a given depth and width could plausibly compute' is not supported by the described constructions. Generating threshold circuits and mapping them demonstrates that transformers of sufficient size can implement these functions, providing lower bounds on capacity rather than upper bounds.

    Authors: We agree that the constructions demonstrate sufficiency: transformers meeting the derived width and depth scaling can implement the cryptographic functions. This yields lower bounds on the minimal resources required. The abstract phrasing 'constructive upper bounds on what a transformer of a given depth and width could plausibly compute' was meant to convey that models falling below these scalings cannot implement the functions, but the wording is imprecise and risks overstating the result. We will revise the abstract to state that the work supplies constructive lower bounds on transformer size for these functions (with consequent implications for the capabilities of smaller models). revision: yes

  2. Referee: [Mapping to transformer architectures] Mapping to transformer architectures: The paper relies on the assumption from Merrill et al. that saturated transformers can be viewed as threshold circuits, and that the generated circuits can be mapped while preserving exact functionality. This mapping step is load-bearing for the claims but the abstract provides no verification details, circuit constructions, or error analysis to assess its validity.

    Authors: The manuscript body contains the explicit threshold-circuit constructions for Keccak, Merkle-Damgård, and Merkle trees, together with the two architecture mappings. These rely on the saturated-transformer equivalence established by Merrill et al. We will revise the abstract to reference the verification approach used and will add an appendix supplying sample circuit diagrams, the precise token-to-gate encoding, and any bounded approximation error introduced by the mapping, so that readers can directly evaluate functionality preservation. revision: partial

Circularity Check

0 steps flagged

No circularity; external citation and constructive circuit mappings are independent of target bounds

full rationale

The derivation begins from the external Merrill et al. result that saturated transformers equal threshold circuits. It then explicitly constructs threshold circuits for Keccak, Merkle-Damgård and Merkle trees, derives their width/depth scaling laws from those constructions, and maps the circuits to two transformer architectures while preserving functionality. None of these steps reduce by definition or by self-citation to the claimed upper bounds; the scaling laws are outputs of the circuit constructions, not inputs. No fitted parameters are relabeled as predictions, no ansatz is smuggled via self-citation, and no uniqueness theorem from the same authors is invoked. The directional claim (upper vs. lower bounds) is a separate correctness question and does not create circularity in the derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the external result that saturated transformers equal threshold circuits; no free parameters or new entities are introduced in the abstract.

axioms (1)
  • domain assumption Saturated transformers can be seen as threshold circuits (Merrill et al.)
    Invoked to justify converting cryptographic circuits into transformer architectures.

pith-pipeline@v0.9.1-grok · 5730 in / 1131 out tokens · 48516 ms · 2026-06-30T07:27:34.083842+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

17 extracted references · 8 canonical work pages · 2 internal anchors

  1. [1]

    Sumeet Ramesh Motwani, Mikhail Baranchuk, Martin Strohmeier, Vijay Bolina, Philip H. S. Torr, Lewis Hammond, and Christian Schroeder de Witt. Secret collusion among ai agents: Multi-agent deception via steganography. InAdvances in Neural Information Processing Systems, 2024

  2. [2]

    William Merrill, Ashish Sabharwal, and Noah A. Smith. Saturated transformers are constant-depth threshold circuits. InTransactions of the Association for Computational Linguistics, 2022

  3. [3]

    Open X-Embodiment: Robotic Learning Datasets and RT-X Models

    Open X-Embodiment Collaboration. Open X-embodiment: Robotic learning datasets and RT-X models.arXiv preprint arXiv:2310.08864, 2023

  4. [4]

    Revealing robust oil and gas company macro-strategies using deep multi-agent reinforcement learning.arXiv preprint arXiv:2211.11043, 2022

    Dylan Radovic, Lucas Kruitwagen, Christian Schroeder de Witt, Ben Caldecott, Shane Tomlinson, and Mark Workman. Revealing robust oil and gas company macro-strategies using deep multi-agent reinforcement learning.arXiv preprint arXiv:2211.11043, 2022

  5. [5]

    An information-theoretic model for steganography.Information and Computation, 192(1):41–56, July 2004

    Christian Cachin. An information-theoretic model for steganography.Information and Computation, 192(1):41–56, July 2004. doi: 10.1016/j.ic.2004.02.003

  6. [6]

    Unelicitable backdoors in language models via cryptographic transformer circuits

    Andis Draguns, Andrew Gritsevskiy, Sumeet Ramesh Motwani, and Christian Schroeder de Witt. Unelicitable backdoors in language models via cryptographic transformer circuits. InAdvances in Neural Information Processing Systems, 2024

  7. [7]

    Manning, Christopher Ré, Diana Acosta-Navas, Drew A

    Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu...

  8. [8]

    URLhttps://arxiv.org/abs/2211.09110

  9. [9]

    Gomez, Łukasz Kaiser, and Illia Polosukhin

    Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. InAdvances in Neural Information Processing Systems (NeurIPS), volume 30, 2017. 9

  10. [10]

    GLU variants improve transformer, 2020

    Noam Shazeer. GLU variants improve transformer, 2020

  11. [11]

    Threshold circuits of bounded depth.Journal of Computer and System Sciences, 46(2):129–154, 1993

    András Hajnal, Wolfgang Maass, Pavel Pudlák, Mario Szegedy, and György Turán. Threshold circuits of bounded depth.Journal of Computer and System Sciences, 46(2):129–154, 1993. doi: 10.1016/0022-0000(93)90001-D

  12. [12]

    Ralph C. Merkle. One way hash functions and DES. InAdvances in Cryptology — CRYPTO ’89, volume 435 ofLecture Notes in Computer Science, pages 428–446. Springer, 1989. doi: 10.1007/0-387-34805-0_40

  13. [13]

    A design principle for hash functions

    Ivan Bjerre Damgård. A design principle for hash functions. InAdvances in Cryptology — CRYPTO ’89, volume 435 ofLecture Notes in Computer Science, pages 416–427. Springer, 1989. doi: 10.1007/0-387-34805-0_39

  14. [14]

    Secrecy, Authentication, and Public Key Systems

    Ralph C. Merkle. A certified digital signature. In Gilles Brassard, editor,Advances in Cryptology — CRYPTO ’89 Proceedings, volume 435 ofLecture Notes in Computer Science, pages 218–238, New York, NY , 1989. Springer. doi: 10.1007/0-387-34805-0\_21. Work originally presented in Merkle’s 1979 Stanford PhD thesis “Secrecy, Authentication, and Public Key Systems”

  15. [15]

    The sponge and duplex constructions.https://keccak.team/sponge_duplex.html, 2011

    Guido Bertoni, Joan Daemen, Michaël Peeters, and Gilles Van Assche. The sponge and duplex constructions.https://keccak.team/sponge_duplex.html, 2011

  16. [16]

    Reifier: Compile algorithms into neural networks

    Andis Draguns. Reifier: Compile algorithms into neural networks. https://github.com/ contramont/reifier

  17. [17]

    Part-b-project-results

    Crroco. Part-b-project-results. https://github.com/Crroco/Part-B-Project-Results/ tree/main. A Keccak Operation Definitions The state a defined in Section 2.6 has length 25w where w∈ {1,2,4,8,16,32,64} , depending on the version of Keccak that is used. Capacity has lengthcand rate has lengthr. Define GF(p) for p a prime number as the Galois Field with p e...