Evolutionary Discovery of Bivariate Bicycle Codes with LLM-Guided Search

Andrew W. Cross; David Kremer; Ismael Faro; Juan Cruz-Benito

arxiv: 2606.02418 · v1 · pith:VSBHTOGRnew · submitted 2026-06-01 · 🪐 quant-ph · cs.AI

Pith reviewed 2026-06-28 14:36 UTC · model grok-4.3

keywords quantum LDPC codes · bivariate bicycle codes · evolutionary search · LLM-guided discovery · CSS codes · non-CSS codes · code certification · Tanner graph

The pith

LLM-guided evolution of code-generating programs yields 465 distinct bivariate-bicycle quantum codes at lengths up to 360.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that language models can iteratively mutate Python programs to generate families of bivariate-bicycle and perturbed bivariate-bicycle quantum LDPC codes, then pass the outputs through a multi-stage certification pipeline. Across five runs the process examined roughly 200,000 candidates in 140 hours and returned 97 CSS codes plus 368 non-CSS variants, some of which match or exceed previously known distance and rate figures. A sympathetic reader would care because the approach replaces hand-crafted algebraic search with an automated, extensible loop that still produces independently verifiable parameters.

Core claim

An LLM-guided evolutionary workflow in which models mutate Python generators for bivariate-bicycle ansatze, followed by GF(2) rank checks, distance certification, MILP bounds, Tanner-graph deduplication, decomposability tests, and local-Clifford equivalence, produces 465 distinct codes at n ≤ 360, including an indecomposable [[288,16,12]] CSS code and perturbed non-CSS codes that attain the gross-code figure of merit at [[144,12,12]].
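The figure of merit mentioned here is given in the paper's Table II caption as FOM = kd^2/n. As a quick worked check under that definition, the two parameter sets named above can be evaluated exactly; the function name `figure_of_merit` is illustrative and not taken from the paper.

```python
from fractions import Fraction

def figure_of_merit(n: int, k: int, d: int) -> Fraction:
    """Return k * d**2 / n as an exact fraction (FOM as in the Table II caption)."""
    return Fraction(k * d * d, n)

gross = figure_of_merit(144, 12, 12)     # [[144,12,12]]
new_css = figure_of_merit(288, 16, 12)   # [[288,16,12]]
print(gross, new_css)  # prints "12 8"
```

Under this measure the [[288,16,12]] code scores 8 and the [[144,12,12]] parameters score 12, which is why matching [[144,12,12]] is the stronger benchmark.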

What carries the argument

LLM-driven mutation of Python programs that output parity-check matrices for bivariate-bicycle and perturbed bivariate-bicycle constructions, evaluated by a staged validation pipeline of linear-algebra rank, distance estimation, MILP, BLISS deduplication, and equivalence checks.
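To make the object being evolved concrete, here is a minimal sketch of a bivariate-bicycle generator followed by the GF(2) rank stage. It uses the standard construction H_X = [A | B], H_Z = [B^T | A^T] and the gross-code polynomials published by Bravyi et al. (the paper's reference [4]). The function names, the bitset row encoding, and the qubit indexing are assumptions made for this illustration; the authors' generators are not reproduced on this page.

```python
def gf2_rank(rows):
    """Rank over GF(2); each row is a Python int used as a bitset."""
    basis = {}  # highest set bit -> reduced row with that leading bit
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row
                break
            row ^= basis[lead]
    return len(basis)

def bb_checks(l, m, a_terms, b_terms):
    """Rows of H_X = [A | B] and H_Z = [B^T | A^T] as bitsets.

    A term (a, b) stands for the monomial x^a y^b, which links grid
    position (i, j) to ((i + a) mod l, (j + b) mod m).
    """
    half = l * m

    def block_row(i, j, terms, sign):
        bits = 0
        for a, b in terms:
            col = ((i + sign * a) % l) * m + ((j + sign * b) % m)
            bits ^= 1 << col  # XOR so repeated terms cancel mod 2
        return bits

    hx, hz = [], []
    for i in range(l):
        for j in range(m):
            hx.append(block_row(i, j, a_terms, +1) | (block_row(i, j, b_terms, +1) << half))
            hz.append(block_row(i, j, b_terms, -1) | (block_row(i, j, a_terms, -1) << half))
    return hx, hz

# Gross code: l = 12, m = 6, A = x^3 + y + y^2, B = y^3 + x + x^2
l, m = 12, 6
hx, hz = bb_checks(l, m, [(3, 0), (0, 1), (0, 2)], [(0, 3), (1, 0), (2, 0)])

n = 2 * l * m
k = n - gf2_rank(hx) - gf2_rank(hz)  # published value for this code: k = 12
css_ok = all(bin(x & z).count("1") % 2 == 0 for x in hx for z in hz)
print(n, k, css_ok)
```

The rank stage fixes n and k only. The distance d = 12 of the gross code is not computed here and needs the later certification stages.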

Load-bearing premise

The staged validation pipeline correctly computes ranks, distances, and equivalence classes for every candidate without false positives or missed duplicates.

What would settle it

A single reported code whose actual distance, after independent recomputation, falls below the certified value listed in the paper.
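Such a refutation would not require rerunning the search. It is enough to exhibit one nontrivial logical operator whose weight is below the claimed distance. The sketch below shows that check for the X-type distance of a CSS code, using the Steane [[7,1,3]] code as a stand-in because it is small enough to verify by hand. The helper names are hypothetical, and non-CSS codes would need the symplectic version of the commutation test.

```python
def gf2_rank(rows):
    """Rank over GF(2); each row is a Python int used as a bitset."""
    basis = {}
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row
                break
            row ^= basis[lead]
    return len(basis)

def weight(v):
    return bin(v).count("1")

def is_nontrivial_logical(e, hz, hx):
    """True if the X-type operator e commutes with all Z checks and is not an X stabilizer."""
    commutes = all(weight(e & h) % 2 == 0 for h in hz)
    independent = gf2_rank(hx + [e]) > gf2_rank(hx)
    return commutes and independent

def refutes_claimed_distance(claimed_d, e, hz, hx):
    """A witness refutes the claim if it is a logical operator lighter than claimed_d."""
    return weight(e) < claimed_d and is_nontrivial_logical(e, hz, hx)

# Steane code: H_X = H_Z = Hamming parity checks; qubit q (1..7) is bit q - 1
hamming = [0b1010101, 0b1100110, 0b1111000]
witness = 0b0000111  # X on qubits 1, 2, 3

print(refutes_claimed_distance(4, witness, hamming, hamming))  # True
print(refutes_claimed_distance(3, witness, hamming, hamming))  # False
print(is_nontrivial_logical(hamming[0], hamming, hamming))     # False
```

A hypothetical claim of d = 4 for the Steane code is refuted by the weight-3 witness, while the correct value d = 3 is not. A stabilizer row fails the check because it lies in the row space of H_X.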

Figures

Figures reproduced from arXiv: 2606.02418 by Andrew W. Cross, David Kremer, Ismael Faro, Juan Cruz-Benito.

**Figure 2.** Rate–distance landscape across all five campaigns.

**Figure 4.** Distance revisions across 154 Campaigns 1–3 poly...

**Figure 5.** Coprimality heuristic

**Figure 6.** Independent convergence on univariate codes.

**Figure 7.** Over-specialization reduces coverage.

**Figure 1.** Per-batch distance distributions under the 150,000-trial multi-decoder protocol. Red dashed lines show the global...

**Figure 2.** Fitness trajectories of Campaigns 1–3.

Original abstract

Quantum LDPC code discovery requires searching large algebraic design spaces while reliably certifying the parameters and equivalence classes of any candidates found. We introduce an LLM-guided evolutionary workflow in which language models mutate Python programs that generate bivariate-bicycle and perturbed bivariate-bicycle code ansätze. Across five campaigns, the system performed approximately 1,650 evolutionary iterations, screened about 2×10^5 candidate codes, and required ~140 hours of computation and ~US$400 in LLM inference cost. Candidate codes are evaluated through a staged validation pipeline combining GF(2) rank computation, distance estimation and certification, mixed-integer linear programming, BLISS Tanner-graph deduplication, decomposability analysis, and local-Clifford equivalence checks. At block length n ≤ 360, the workflow identifies 465 distinct candidate codes: 97 CSS bivariate-bicycle codes and 368 non-CSS perturbed variants. The CSS search recovers known high-performing codes and finds new finite-length representatives, including an indecomposable [[288,16,12]] code and higher-weight codes with up to k = 50 at distance d = 8. The non-CSS search produces perturbed codes matching the gross-code figure of merit at [[144,12,12]], along with additional high-distance candidates reported as certified values or upper bounds according to MILP status. Overall, these results show that LLM-guided program evolution can serve as a practical tool for structured quantum-code discovery when paired with independent evaluation.
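The abstract lists decomposability analysis without saying how it is done. One simple one-way test, offered here as an assumption and not as the paper's method, is to count connected components of the Tanner graph: if the checks split into groups acting on disjoint sets of qubits, the code is a direct sum of smaller codes. A connected Tanner graph does not by itself prove indecomposability, since a different choice of generators could still expose a split.

```python
def tanner_components(check_supports, n_qubits):
    """Number of connected components of the Tanner graph, counted over qubits.

    check_supports: one list of qubit indices per check (X and Z checks together).
    """
    parent = list(range(n_qubits))

    def find(q):
        while parent[q] != q:
            parent[q] = parent[parent[q]]  # path halving
            q = parent[q]
        return q

    for support in check_supports:
        for q in support[1:]:
            root_a, root_b = find(support[0]), find(q)
            if root_a != root_b:
                parent[root_b] = root_a
    return len({find(q) for q in range(n_qubits)})

# Toy supports: two separate 3-qubit blocks, then the same blocks joined by one check
split = [[0, 1], [1, 2], [3, 4], [4, 5]]
joined = split + [[2, 3]]
print(tanner_components(split, 6), tanner_components(joined, 6))  # prints "2 1"
```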

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

LLM-guided search finds some new bivariate bicycle codes with multi-stage validation, but MILP optimality gaps are a real concern for the larger n.

The punchline for this paper is that LLM-guided program evolution can discover new bivariate bicycle quantum codes, including an indecomposable [[288,16,12]] code, while also recovering known ones, after screening about 200,000 candidates across five campaigns.

The work does a good job of describing a practical pipeline that combines the evolutionary search with independent validation steps: GF(2) rank, distance estimation via MILP, Tanner graph deduplication with BLISS, decomposability analysis, and local-Clifford equivalence checks. Reporting the compute time and LLM cost adds transparency, and finding perturbed non-CSS variants that match known figures of merit shows the method can explore beyond CSS codes.

The main soft spot is the reliability of the MILP-based distance certification for the larger block lengths. As the stress-test note points out, MILP may not always close the optimality gap in reasonable time for n=288 instances, which means some reported distances could be upper bounds instead of exact values. The paper notes this distinction in the abstract, but the total count of 465 distinct codes and the claim of new high-performing examples would be stronger with more information on how many distances were fully certified versus bounded. The equivalence checks are also only as good as the implementation of the isomorphism tests.
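For readers unfamiliar with why a MILP run can end in either a certified distance or only an upper bound, a commonly used formulation for the X-distance of a CSS code is sketched below. This is an assumption for illustration; the manuscript's own formulation is not given on this page. Here r is the number of rows of H_Z, \ell_j is the j-th vector in a basis of logical Z operators, and the integer slack variables s and t encode the parity conditions.

```latex
\begin{aligned}
d_X^{(j)} \;=\; \min_{x,\,s,\,t}\quad & \sum_{i=1}^{n} x_i \\
\text{s.t.}\quad & H_Z\, x = 2\,s, \\
& \ell_j^{\mathsf{T}} x = 2\,t + 1, \\
& x \in \{0,1\}^{n},\quad s \in \mathbb{Z}_{\ge 0}^{\,r},\quad t \in \mathbb{Z}_{\ge 0},
\end{aligned}
\qquad
d_X \;=\; \min_{1 \le j \le k} d_X^{(j)} .
```

A solver that stops at a time limit returns, for each j, an incumbent value U_j and a dual bound L_j with L_j \le d_X^{(j)} \le U_j. The smallest incumbent is always a valid upper bound on d_X, because it corresponds to an explicit logical operator. Since the objective is an integer, the value is certified only when \lceil L_j \rceil \ge \min_i U_i holds for every j; otherwise the reported number is an upper bound.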

This is a paper for the quantum error correction community focused on code construction and discovery tools. Readers interested in applying AI methods to algebraic code design would get concrete value from the workflow and the listed codes.

It deserves a serious referee because the approach is novel and the results include specific new code parameters with a described validation process.

I recommend sending it for peer review.

Referee Report

2 major / 2 minor

Summary. The paper introduces an LLM-guided evolutionary workflow for discovering bivariate-bicycle quantum LDPC codes and perturbed variants. Over approximately 1,650 iterations that screened about 2×10^5 candidates, it identifies 465 distinct codes at n ≤ 360: 97 CSS codes and 368 non-CSS codes. Examples include an indecomposable [[288,16,12]] CSS code and perturbed non-CSS codes at [[144,12,12]] that match the gross-code figure of merit. Parameters are certified via a staged pipeline of GF(2) rank, MILP distance certification, BLISS deduplication, and equivalence checks.

Significance. If the reported parameters hold under independent verification, the work shows that LLM-guided program evolution can practically explore algebraic design spaces for quantum LDPC codes, recovering known high-performing codes while producing new finite-length representatives. The explicit use of external validation tools (MILP, BLISS isomorphism checks, local-Clifford equivalence) is a methodological strength that supports reproducibility of the numerical claims.

major comments (2)

[Validation pipeline description] Validation pipeline (description of staged checks): the claim of 465 distinct codes with exact parameters (e.g., [[288,16,12]] indecomposable, [[144,12,12]]) depends on MILP distance certification producing no false-positive exact distances. The manuscript provides no MILP formulation, solver parameters, time limits, optimality-gap statistics, or error analysis for n=288 instances, so it is impossible to confirm that reported d=12 values are certified rather than upper bounds.
[Staged validation pipeline] BLISS deduplication and equivalence checks (staged pipeline): the count of 465 distinct candidates and the novelty of the indecomposable [[288,16,12]] code rest on the correctness of Tanner-graph isomorphism tests and local-Clifford enumeration. No implementation details, test cases, or verification that all equivalences were exhaustively checked are supplied, which directly affects the reported distinctness and parameter counts.
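To illustrate what the second major comment asks the authors to document, here is a brute-force stand-in for canonical labeling on tiny check matrices. It is not BLISS and not the paper's implementation: it tries every column permutation and sorts the rows, so two matrices get the same canonical form exactly when one can be turned into the other by permuting rows and columns. The cost is factorial in the number of columns, which is the reason a dedicated tool is needed at n = 288.

```python
from itertools import permutations

def canonical_form(matrix):
    """Smallest sorted-row representation over all column permutations."""
    n_cols = len(matrix[0])
    best = None
    for perm in permutations(range(n_cols)):
        key = tuple(sorted(tuple(row[c] for c in perm) for row in matrix))
        if best is None or key < best:
            best = key
    return best

# A path-shaped Tanner graph, a relabeled copy of it, and a star-shaped one
path = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1]]
relabeled = [[row[c] for c in (2, 0, 3, 1)] for row in reversed(path)]
star = [[1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 1]]

print(canonical_form(path) == canonical_form(relabeled))  # True
print(canonical_form(path) == canonical_form(star))       # False
```

Matching Tanner graphs show that two candidates are the same code up to relabeling. Distinct Tanner graphs do not show that the codes differ, because one code has many generator sets, which is why the pipeline also needs the separate equivalence checks the comment refers to.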

minor comments (2)

[Abstract] Abstract: the phrase 'ans"atze' contains a stray quote character that should be corrected for typesetting.
[Results] The manuscript would benefit from an explicit table summarizing the 465 codes by CSS/non-CSS, distance certification status (exact vs. upper bound), and MILP gap status.
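The summary table requested in the second minor comment could be generated directly from per-code solver records. The sketch below assumes each record stores the best weight found (`incumbent`) and the solver's lower bound (`dual_bound`); these field names and all values are placeholders invented for the example, not data from the paper.

```python
import math
from collections import Counter

def status(incumbent, dual_bound, tol=1e-6):
    """Integer objective: certified once the rounded-up lower bound meets the incumbent."""
    return "certified" if math.ceil(dual_bound - tol) >= incumbent else "upper bound"

# Placeholder records (hypothetical values, for illustration only)
records = [
    {"css": True,  "incumbent": 12, "dual_bound": 12.0},
    {"css": True,  "incumbent": 12, "dual_bound": 9.4},
    {"css": False, "incumbent": 10, "dual_bound": 9.2},
    {"css": False, "incumbent": 14, "dual_bound": 11.0},
]

tally = Counter(
    ("CSS" if r["css"] else "non-CSS", status(r["incumbent"], r["dual_bound"]))
    for r in records
)
for (family, label), count in sorted(tally.items()):
    print(f"{family:8s} {label:12s} {count}")
```

The small tolerance guards against floating-point noise in the reported bound. A bound of 9.2 still certifies an incumbent of 10 because no integer weight lies strictly between them.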

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments on the manuscript. We address each major comment below and will incorporate the requested details into a revised version to improve reproducibility.

Referee: [Validation pipeline description] Validation pipeline (description of staged checks): the claim of 465 distinct codes with exact parameters (e.g., [[288,16,12]] indecomposable, [[144,12,12]]) depends on MILP distance certification producing no false-positive exact distances. The manuscript provides no MILP formulation, solver parameters, time limits, optimality-gap statistics, or error analysis for n=288 instances, so it is impossible to confirm that reported d=12 values are certified rather than upper bounds.

Authors: We agree that the manuscript does not provide the MILP formulation, solver parameters, time limits, optimality-gap statistics, or error analysis. In the revised version we will add an appendix or subsection detailing the exact MILP formulation used for distance certification, the solver and its parameters, time limits applied, optimality-gap tolerances, and certification statistics for the n=288 instances (and others), making clear which distances are certified exact values versus upper bounds.
revision: yes

Referee: [Staged validation pipeline] BLISS deduplication and equivalence checks (staged pipeline): the count of 465 distinct candidates and the novelty of the indecomposable [[288,16,12]] code rest on the correctness of Tanner-graph isomorphism tests and local-Clifford enumeration. No implementation details, test cases, or verification that all equivalences were exhaustively checked are supplied, which directly affects the reported distinctness and parameter counts.

Authors: We acknowledge that the manuscript supplies insufficient implementation details, test cases, and verification for the BLISS Tanner-graph isomorphism tests and local-Clifford equivalence enumeration. The revised manuscript will expand the validation-pipeline description to include the precise BLISS usage, the local-Clifford enumeration algorithm, representative test cases, and confirmation that equivalence checks were exhaustive for the reported codes, thereby supporting the distinctness counts and novelty claims.
revision: yes

Circularity Check

0 steps flagged

No circularity: results from explicit computational search and external validation

The paper describes an LLM-guided evolutionary search that mutates Python programs to generate candidate bivariate-bicycle codes, followed by a staged validation pipeline (GF(2) rank, distance estimation via MILP, BLISS deduplication, decomposability analysis, and local-Clifford checks). No step reduces a claimed result to its own inputs by definition, renames a fitted quantity as a prediction, or relies on a load-bearing self-citation whose content is unverified. The 465 codes and their parameters (e.g., [[288,16,12]]) are outputs of independent computational procedures rather than tautological redefinitions. The workflow is self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on the empirical effectiveness of LLM mutation and the soundness of the listed validation procedures; no free parameters, axioms, or invented entities are introduced in the abstract.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Evolving Quantum Error-Correcting Encodings for Molecular Simulation
quant-ph 2026-06 conditional novelty 7.0

LLM-driven evolutionary program synthesis discovers Generalized Superfast Encodings with exact distance 5 (and 6 on one instance) for molecular Hamiltonians, the first beyond distance 3.

Large-Language-Model Discovery of Quantum LDPC Codes through Structured Concept Evolution
quant-ph 2026-06 unverdicted novelty 7.0

A new LLM-guided search method called structured concept evolution discovers competitive lifted-product qLDPC code families including non-abelian constructions.

Reference graph

Works this paper leans on

14 extracted references · 2 linked inside Pith · cited by 2 Pith papers

[1] A. R. Calderbank and P. W. Shor, Good quantum error-correcting codes exist, Physical Review A 54, 1098 (1996).

[2] A. M. Steane, Multiple-particle interference and quantum error correction, Proceedings of the Royal Society of London A 452, 2551 (1996).

[3] T. Junttila and P. Kaski, Engineering an efficient canonical labeling tool for large and sparse graphs, in Proceedings of the 9th Workshop on Algorithm Engineering and Experiments (ALENEX) (SIAM, 2007) pp. 135–149.

[4] S. Bravyi, A. W. Cross, J. M. Gambetta, D. Maslov, P. Rall, and T. J. Yoder, High-threshold and low-overhead fault-tolerant quantum memory, Nature 627, 778 (2024), see also arXiv:2308.07915.

[5] J. N. Eberhardt, F. R. F. Pereira, and V. Steffan, Pruning qLDPC codes: Towards bivariate bicycle codes with open boundary conditions, arXiv:2412.04181 [quant-ph] (2024), preprint.

[6] J.-P. Tillich and G. Zémor, Quantum LDPC codes with positive rate and minimum distance proportional to the square root of the blocklength, IEEE Transactions on Information Theory 60, 1193 (2014), arXiv:0903.0566.

[7] P. Panteleev and G. Kalachev, Degenerate quantum LDPC codes with good finite length performance, Quantum 5, 585 (2021), arXiv:1904.02703.

[8] J. Roffe, D. R. White, S. Burton, and E. Campbell, Decoding across the quantum low-density parity-check code landscape, Physical Review Research 2, 043423 (2020).

[9] B. C. B. Symons, A. Rajput, and D. E. Browne, Sequences of bivariate bicycle codes from covering graphs, arXiv:2511.13560 [quant-ph] (2025), preprint.

[10] Z. Liang and Y.-A. Chen, Self-dual bivariate bicycle codes with transversal Clifford gates, arXiv:2510.05211 [quant-ph] (2025), v2, January 2026.

[11] Z. Liang, K. Liu, H. Song, and Y.-A. Chen, Generalized toric codes on twisted tori for quantum error correction, PRX Quantum 6, 020357 (2025), arXiv:2503.03827.

[12] H.-K. Lin and L. P. Pryadko, Quantum two-block group algebra codes, Physical Review A 109, 022407 (2024), arXiv:2306.16400.

[13] Q. Huangfu and J. A. J. Hall, Parallelizing the dual revised simplex method, Mathematical Programming Computation 10, 119 (2018).

[14] P. Virtanen, R. Gommers, T. E. Oliphant, M. Haberland, T. Reddy, D. Cournapeau, E. Burovski, P. Peterson, W. Weckesser, J. Bright, et al., SciPy 1.0: Fundamental algorithms for scientific computing in Python, Nature Methods 17, 261 (2020).

TABLE II. All verified codes at n = 288 (98 polynomial representations, 49 distinct codes), sorted by FOM = kd^2/n...
