Discovering a Zeta Map Algorithm on Dyck Paths via Mechanistic Interpretability

Blake Jackson; Kyu-Hwan Lee; Xiaoyu Huang

arxiv: 2605.30482 · v1 · pith:5YGQ4PL7new · submitted 2026-05-28 · 💻 cs.LG

Discovering a Zeta Map Algorithm on Dyck Paths via Mechanistic Interpretability

Xiaoyu Huang , Blake Jackson , Kyu-Hwan Lee This is my paper

Pith reviewed 2026-06-29 08:26 UTC · model grok-4.3

classification 💻 cs.LG

keywords Dyck pathszeta mapscaffolding mapmechanistic interpretabilitytransformercombinatorial bijectionq,t-Catalan numbers

0 comments

The pith

Mechanistic interpretability of a small transformer yields an explicit peak-centered traversal algorithm on Dyck paths that matches the zeta map up to reversal.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper trains a one-layer one-head encoder-decoder transformer on the zeta map, a classical bijection on Dyck paths tied to the q,t-Catalan numbers. Analysis of encoder representations and decoder cross-attention reveals a level-based mechanism that the authors translate into the scaffolding map, an explicit algorithm that traverses Dyck paths by centering on peaks. They prove this new algorithm agrees with the zeta map after accounting for a reversal convention in the labeling. A reader would care because the result converts opaque model internals into a self-contained combinatorial construction that can be checked by hand without any neural network.

Core claim

The central claim is that level signals made linearly accessible in the encoder and structured selection performed by decoder cross-attention can be read out as the scaffolding map, a peak-centered traversal algorithm for Dyck paths. The authors prove that this algorithm coincides with the zeta map modulo a reversal convention in the labeling, furnishing an explicit, human-verifiable description of the classical bijection.

What carries the argument

The scaffolding map: an explicit peak-centered traversal algorithm for Dyck paths extracted from level-based transformer signals.

If this is right

The zeta map admits a direct peak-centered implementation that does not rely on recursion or generating functions.
Path levels alone suffice to determine the image of any Dyck path under the zeta map.
The equivalence can be verified by applying both maps to the same finite set of Dyck paths and confirming agreement after the reversal step.
The bijection is independent of the particular training run once the level signals are extracted.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same extraction procedure could be tried on other Catalan bijections to obtain additional explicit algorithms.
If the scaffolding map generalizes cleanly, it may simplify proofs that rely on the zeta map by replacing it with a traversal description.
Testing the map on randomly generated large Dyck paths would provide an independent check outside the model's training distribution.

Load-bearing premise

The level-based signals extracted from the model's encoder and decoder can be translated into a complete combinatorial algorithm whose correctness holds independently of training dynamics and requires no further adjustments.

What would settle it

Any Dyck path on which the scaffolding map, after reversal adjustment, produces an output different from the zeta map.

Figures

Figures reproduced from arXiv: 2605.30482 by Blake Jackson, Kyu-Hwan Lee, Xiaoyu Huang.

**Figure 1.** Figure 1: An example of Dyck path of semilength 5. The q, t-Catalan numbers refine the Catalan numbers and admit two Dyck-path expansions involving the statistics area, bounce, and dinv: Cn(q, t) = X w∈Dyck(n) q area(w) t bounce(w) = X w∈Dyck(n) q dinv(w) t area(w) . (1) These formulas were discovered independently by (Haglund, 2003; Haiman, 2000). Here area(w) counts the cells between the path and the diagonal, bo… view at source ↗

**Figure 2.** Figure 2: Representative decoder cross-attention matrices for the Minimal Dyck Transformer. The recurring sparse patterns indicate that generation is organized by level-based selection rather than by an unstructured lookup table. accessible. This supports a two-stage interpretation: the encoder computes level-like information, and the decoder uses it to select and traverse positions. These observations suggest that … view at source ↗

**Figure 3.** Figure 3: Example of decoder cross-attention and its causal role. Left: input, target, and model-generated Dyck paths, with attended input positions highlighted in yellow. Middle: sparse cross-attention used when generating the first output step on the left. Right: causal ablation results showing negligible decreases in accuracy, measured by comparing the generated output path with the correct target path up to the … view at source ↗

**Figure 4.** Figure 4: Cross-attention matrices at a fixed output step 23 generation for a batch of random samples, illustrating that the triangular shift pattern occurs consistently across examples. 11 [PITH_FULL_IMAGE:figures/full_fig_p011_4.png] view at source ↗

**Figure 5.** Figure 5: Level-attention comparison across examples. Decoder cross-attention at the first output step consistently concentrates on high-level input positions. 12 [PITH_FULL_IMAGE:figures/full_fig_p012_5.png] view at source ↗

**Figure 6.** Figure 6: Encoder self-attention matrices across a batch of examples. The recurring block-like structure reflects the binary organization of Dyck words and suggests that the encoder forms structured position-level representations. 13 [PITH_FULL_IMAGE:figures/full_fig_p013_6.png] view at source ↗

**Figure 7.** Figure 7: PCA of the learned encoder positional embeddings. The second principal component largely follows the ordering of input positions, while the first component separates some early positions from the main cluster [PITH_FULL_IMAGE:figures/full_fig_p014_7.png] view at source ↗

**Figure 8.** Figure 8: PCA of the learned decoder positional embeddings. Decoder positions form a more regular spiral trajectory in the first two principal components (Wennberg & Henter, 2024; Vaswani et al., 2017). 14 [PITH_FULL_IMAGE:figures/full_fig_p014_8.png] view at source ↗

**Figure 9.** Figure 9: Dyck path with marked positions used in the proof. In step (3)(a) of the scaffolding algorithm, the queues Qb, Qb−1, . . . , Q0 are formed, where b = max{ℓi} = max{ai}+ 1 as before. For k = b, b − 1, . . . , 0, each Qk consists of positions i1, i2, . . . . Replace them with the corresponding bi1 , bi2 , . . . , and denote the resulting sequence by Q′ k . For example, in Example A.7, we have Q4 = (8), Q3 = … view at source ↗

read the original abstract

Machine learning is increasingly used in mathematical discovery, but in mathematics the desired output is often not a prediction itself, but an explicit construction that can be checked independently. We study this setting through the zeta map on Dyck paths, a classical bijection in the combinatorics of the q,t-Catalan numbers. We train a deliberately small one-layer, one-head encoder-decoder transformer on this map and analyze its learned computation using mechanistic interpretability tools, including decoder cross-attention analysis, linear probing, and causal intervention. The analysis reveals a level-based mechanism: encoder representations make path levels linearly accessible, while the decoder selects and traverses input positions in a structured way. Translating these signals into combinatorics leads to the scaffolding map, an explicit peak-centered traversal algorithm for Dyck paths. We prove that this algorithm agrees with the zeta map, modulo a reversal convention in the labeling. This gives a controlled example of AI-assisted mathematical discovery in which mechanistic interpretability turns model behavior into a precise, human-verifiable combinatorial algorithm.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

They extract an explicit scaffolding map from a tiny transformer's internals and prove it matches the zeta map, giving a clean example of interpretability-to-algorithm.

read the letter

The main point is that a one-layer one-head encoder-decoder transformer trained on the zeta map for Dyck paths yields, through attention and probe analysis, an explicit peak-centered traversal rule they call the scaffolding map, which they then prove equivalent to the classical zeta map up to reversal.

What works is the controlled pipeline. The model is small enough for full mechanistic inspection, encoder representations make levels linearly readable, decoder cross-attention selects positions in a structured order, and the authors convert those signals into a discrete algorithm that stands on its own. Ending with an independent combinatorial proof rather than a fitted claim is the right move; it keeps the result checkable outside the model.

The soft spot sits in the translation step. The abstract says the signals lead to the algorithm, but it is not obvious from the description whether every rule in the scaffolding map follows directly from the linear probes and attention weights or whether some discretization, tie-breaking, or rule-formulation choice was added by hand. If the latter, the independence from prior combinatorial knowledge and from training dynamics is less secure, even though the later proof of equivalence can still hold. The stress-test note on unstated choices lands here.

This is for people working on mechanistic interpretability in combinatorial settings or on AI-assisted discovery where the output must be human-verifiable. A reader who wants a documented case of model internals becoming an explicit algorithm will find the example useful. It deserves peer review because it supplies both a new explicit construction and a proof; the extraction details are the part that needs more scrutiny, not the overall framing.

Referee Report

2 major / 2 minor

Summary. The paper trains a deliberately small one-layer, one-head encoder-decoder transformer on the zeta map bijection between Dyck paths. Mechanistic interpretability tools (decoder cross-attention, linear probes, causal interventions) are used to identify a level-based mechanism in the encoder representations and decoder selection patterns. These signals are translated into an explicit peak-centered traversal algorithm called the scaffolding map; an independent combinatorial proof is given that this algorithm agrees with the zeta map up to a reversal convention on the labeling.

Significance. If the extraction from model signals to the scaffolding map is shown to be complete and free of unstated human adjustments, the work supplies a controlled, reproducible example of AI-assisted combinatorial discovery in which the final output is an independently verifiable algorithm rather than a fitted predictor. The separation between the interpretability step and the subsequent machine-checkable proof is a methodological strength.

major comments (2)

[translation from model signals to scaffolding map] The section describing the translation from encoder level signals and decoder cross-attention patterns to the scaffolding map rules does not explicitly enumerate the discretization thresholds, tie-breaking conventions, or rule-formulation steps used to convert continuous activations into the discrete peak-centered traversal. Without this enumeration it is impossible to verify that the algorithm definition is fully entailed by the linear probes and attention weights rather than by additional post-hoc choices.
[definition of scaffolding map] The claim that the scaffolding map is 'gap-free' and independent of training dynamics rests on the completeness of the extracted traversal rules. The manuscript should supply a case-by-case verification (or a machine-checked enumeration) that every Dyck path is handled by the extracted rules exactly as described, before the equivalence proof begins.

minor comments (2)

Notation for path levels and peak indices should be introduced once with a small illustrative figure rather than scattered across the interpretability and combinatorial sections.
[main theorem] The reversal convention on labeling is stated in the abstract and conclusion but should be made explicit in the statement of the main theorem so that readers can check the equivalence without backtracking.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments highlighting the need for greater explicitness in the interpretability-to-algorithm pipeline. We address both major points below and will incorporate clarifications and additional verification material in the revision.

read point-by-point responses

Referee: [translation from model signals to scaffolding map] The section describing the translation from encoder level signals and decoder cross-attention patterns to the scaffolding map rules does not explicitly enumerate the discretization thresholds, tie-breaking conventions, or rule-formulation steps used to convert continuous activations into the discrete peak-centered traversal. Without this enumeration it is impossible to verify that the algorithm definition is fully entailed by the linear probes and attention weights rather than by additional post-hoc choices.

Authors: We agree that the translation steps require explicit enumeration for full reproducibility. The revised manuscript will add a dedicated subsection that lists: (i) the precise numerical thresholds applied to linear probe outputs for level identification, (ii) all tie-breaking conventions (e.g., left-to-right selection among equal-level peaks), and (iii) the ordered sequence of operations that convert attention patterns into the traversal rules. This addition will document the mapping without post-hoc adjustments beyond what the model signals directly support. revision: yes
Referee: [definition of scaffolding map] The claim that the scaffolding map is 'gap-free' and independent of training dynamics rests on the completeness of the extracted traversal rules. The manuscript should supply a case-by-case verification (or a machine-checked enumeration) that every Dyck path is handled by the extracted rules exactly as described, before the equivalence proof begins.

Authors: The independence from training dynamics is justified by the deterministic combinatorial definition of the scaffolding map extracted from the model and the subsequent machine-checkable equivalence proof, which holds for arbitrary Dyck paths. Nevertheless, to increase transparency we will append a machine-checked enumeration verifying that the extracted rules cover all Dyck paths of semilength ≤ 8 without gaps or unstated exceptions. This verification will appear before the main equivalence proof. The proof itself then establishes correctness for all sizes; an exhaustive enumeration of the infinite set is neither feasible nor necessary once the rules are shown to be complete for the finite cases and the bijection is proven. revision: partial

Circularity Check

0 steps flagged

No circularity; derivation ends in independent combinatorial proof

full rationale

The paper trains a small transformer on the zeta map, applies mechanistic interpretability (cross-attention, linear probes, interventions) to extract level-based signals, translates those signals into an explicit scaffolding map algorithm, and then supplies a separate mathematical proof that the scaffolding map agrees with the zeta map (modulo reversal). The final verification step is a human-checkable combinatorial argument whose correctness does not depend on model parameters, fitted thresholds, or self-citation chains. No quoted equation or definition reduces the claimed algorithm to a quantity defined only inside the training loop or by prior self-citation; the interpretability analysis serves as a discovery heuristic whose output is externally validated.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entities

The central claim rests on standard properties of Dyck paths and the classical zeta map definition, plus the assumption that model internals can be faithfully mapped to a combinatorial procedure. No free parameters or invented entities are mentioned in the abstract.

axioms (2)

standard math Dyck paths are lattice paths from (0,0) to (2n,0) with steps (1,1) and (1,-1) that never go below the x-axis.
Invoked as the domain of the zeta map bijection.
standard math The zeta map is a well-defined bijection on Dyck paths used in the theory of q,t-Catalan numbers.
The target object the model is trained to reproduce and the object the scaffolding map is proven to match.

invented entities (1)

scaffolding map no independent evidence
purpose: Explicit peak-centered traversal algorithm extracted from model signals and proven equivalent to zeta map.
Newly introduced combinatorial object whose definition is derived from the transformer analysis.

pith-pipeline@v0.9.1-grok · 5710 in / 1656 out tokens · 20850 ms · 2026-06-29T08:26:23.206391+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

13 extracted references · 1 canonical work pages · 1 internal anchor

[1]

Charton, F., Ellenberg, J., Wagner, A., and Williamson, G

URL https://openreview.net/forum? id=DF6jVG4fG8. Charton, F., Ellenberg, J., Wagner, A., and Williamson, G. Patternboost: Constructions in mathematics with a little help from ai, 2024. Chughtai, B., Chan, L., and Nanda, N. A toy model of universality: Reverse engineering how networks learn group operations. InInternational Conference on Ma- chine Learning...

2024
[2]

Friedman, D., Wettig, A., and Chen, D

URL https://openreview.net/forum? id=PUQ0rrmuDM. Friedman, D., Wettig, A., and Chen, D. Learning trans- former programs.Advances in Neural Information Pro- cessing Systems, 36:49044–49067, 2023. Gukov, S., Halverson, J., Ruehle, F., and Sułkowski, P. Learning to unknot.Machine Learning: Science and Technology, 2(2):025035, 2021. Haglund, J. Conjectured st...

2023
[3]

AlphaEvolve: A coding agent for scientific and algorithmic discovery

doi: 10.1016/S0001-8708(02)00061-0. URL https://www.sciencedirect.com/ science/article/pii/S0001870802000610. Haiman, M. The q,t–Catalan Numbers and the Alternat- ing Component of the Diagonal Harmonics. Preprint, University of California, Berkeley, 2000. Hashemi, B., Corominas, R., and Giacchetto, A. Can trans- formers do enumerative geometry? InInternat...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1016/s0001-8708(02)00061-0 2000
[4]

Shoot a billiard from(0,0)heading North
[5]

When you hit the start of an East step, turn East and travel until you hit the diagonal
[6]

Turn North and repeat until you reach(n, n)
[7]

,(jk, jk),(n, n)

Record the places you hit the diagonal(0,0),(j 1, j1), . . . ,(jk, jk),(n, n). Then bounce(w) = kX i=1 (n−j i). Definition A.6.Then th q, t-Catalan number is Cn(q, t) = X w∈Dyck(n) qarea(w)tbounce(w) = X w∈Dyck(n) qdinv(w)tarea(w) ∈Z >0[q, t]. 8 Discovering a Zeta Map Algorithm on Dyck Paths via Mechanistic Interpretability A.2. Haglund’s Zeta Map The zet...

2002
[8]

, an ofw(row lengths from bottom)

Compute the area sequencea 1, a2, . . . , an ofw(row lengths from bottom)
[9]

, b, scan the sequencea 1, a2,

For eachk= 0,1, . . . , b, scan the sequencea 1, a2, . . . , an from left to right and record: (a) a0for each occurrence ofk−1, (b) a1for each occurrence ofk. Since the area sequence contains no entries equal to −1 or b, this procedure starts with a 1 when k= 0 and ends with a 0whenk=b. Below is an example of the image of the zeta map of a Dyck path of se...

2024
[10]

a0for each occurrence ofk−1,
[11]

Accordingly, an algorithm forrev◦ζis given by the following

a1for each occurrence ofk. Accordingly, an algorithm forrev◦ζis given by the following. For eachk=b, b−1, . . . ,0, scan the sequencea 1, a2, . . . , an from right to left and record:
[12]

a0for each occurrence ofk,
[13]

From the algorithm of rev◦ζ , for each k=b, b−1,

a1for each occurrence ofk−1. From the algorithm of rev◦ζ , for each k=b, b−1, . . . ,0 , we obtain a subsequence Pk of ˜a(w)consisting of k and k−1 , where ˜a(w) = (an, an−1, . . . , a1). (Recall that we scan a(w) from right to left in the algorithm of rev◦ζ .) For example, the Dyck path in Example A.7 (see Figure 9 below) yields P4 = (3), P 3 = (2,3,2,2,...

[1] [1]

Charton, F., Ellenberg, J., Wagner, A., and Williamson, G

URL https://openreview.net/forum? id=DF6jVG4fG8. Charton, F., Ellenberg, J., Wagner, A., and Williamson, G. Patternboost: Constructions in mathematics with a little help from ai, 2024. Chughtai, B., Chan, L., and Nanda, N. A toy model of universality: Reverse engineering how networks learn group operations. InInternational Conference on Ma- chine Learning...

2024

[2] [2]

Friedman, D., Wettig, A., and Chen, D

URL https://openreview.net/forum? id=PUQ0rrmuDM. Friedman, D., Wettig, A., and Chen, D. Learning trans- former programs.Advances in Neural Information Pro- cessing Systems, 36:49044–49067, 2023. Gukov, S., Halverson, J., Ruehle, F., and Sułkowski, P. Learning to unknot.Machine Learning: Science and Technology, 2(2):025035, 2021. Haglund, J. Conjectured st...

2023

[3] [3]

AlphaEvolve: A coding agent for scientific and algorithmic discovery

doi: 10.1016/S0001-8708(02)00061-0. URL https://www.sciencedirect.com/ science/article/pii/S0001870802000610. Haiman, M. The q,t–Catalan Numbers and the Alternat- ing Component of the Diagonal Harmonics. Preprint, University of California, Berkeley, 2000. Hashemi, B., Corominas, R., and Giacchetto, A. Can trans- formers do enumerative geometry? InInternat...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1016/s0001-8708(02)00061-0 2000

[4] [4]

Shoot a billiard from(0,0)heading North

[5] [5]

When you hit the start of an East step, turn East and travel until you hit the diagonal

[6] [6]

Turn North and repeat until you reach(n, n)

[7] [7]

,(jk, jk),(n, n)

Record the places you hit the diagonal(0,0),(j 1, j1), . . . ,(jk, jk),(n, n). Then bounce(w) = kX i=1 (n−j i). Definition A.6.Then th q, t-Catalan number is Cn(q, t) = X w∈Dyck(n) qarea(w)tbounce(w) = X w∈Dyck(n) qdinv(w)tarea(w) ∈Z >0[q, t]. 8 Discovering a Zeta Map Algorithm on Dyck Paths via Mechanistic Interpretability A.2. Haglund’s Zeta Map The zet...

2002

[8] [8]

, an ofw(row lengths from bottom)

Compute the area sequencea 1, a2, . . . , an ofw(row lengths from bottom)

[9] [9]

, b, scan the sequencea 1, a2,

For eachk= 0,1, . . . , b, scan the sequencea 1, a2, . . . , an from left to right and record: (a) a0for each occurrence ofk−1, (b) a1for each occurrence ofk. Since the area sequence contains no entries equal to −1 or b, this procedure starts with a 1 when k= 0 and ends with a 0whenk=b. Below is an example of the image of the zeta map of a Dyck path of se...

2024

[10] [10]

a0for each occurrence ofk−1,

[11] [11]

Accordingly, an algorithm forrev◦ζis given by the following

a1for each occurrence ofk. Accordingly, an algorithm forrev◦ζis given by the following. For eachk=b, b−1, . . . ,0, scan the sequencea 1, a2, . . . , an from right to left and record:

[12] [12]

a0for each occurrence ofk,

[13] [13]

From the algorithm of rev◦ζ , for each k=b, b−1,

a1for each occurrence ofk−1. From the algorithm of rev◦ζ , for each k=b, b−1, . . . ,0 , we obtain a subsequence Pk of ˜a(w)consisting of k and k−1 , where ˜a(w) = (an, an−1, . . . , a1). (Recall that we scan a(w) from right to left in the algorithm of rev◦ζ .) For example, the Dyck path in Example A.7 (see Figure 9 below) yields P4 = (3), P 3 = (2,3,2,2,...