Machine Learning Optimal Quantum Error Correction Thresholds
Pith reviewed 2026-06-26 11:30 UTC · model grok-4.3
The pith
The coherent information sets a sharp lower bound on the binary cross-entropy loss of neural decoders that track logical operators through noisy quantum channels.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The coherent information is a sharp lower bound on the loss achievable by any decoder that tracks logical operators across noisy channels. A transformer network trained to estimate the coherent information via maximum likelihood decoding therefore yields threshold estimates that match known theoretical limits and, when used as a decoder, delivers lower logical error rates than minimum-weight perfect matching.
What carries the argument
Transformer-based neural network trained via maximum likelihood decoding to estimate the coherent information, which serves as the lower bound on cross-entropy loss for logical-operator-tracking decoders.
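A minimal sketch of why a cross-entropy loss has a sharp floor, using a 3-bit repetition code under bit-flip noise as a toy stand-in for the surface code. All function names are hypothetical and none of this is the paper's code. The floor computed here is the conditional entropy of the logical label given the syndrome; the step that relates this quantity to the coherent information is the paper's theorem and is not reproduced.

```python
import itertools
import math

def joint_syndrome_logical(p):
    """Exact joint distribution P(s, l) for the 3-bit repetition code under
    i.i.d. bit flips with probability p (toy model, assumed for illustration)."""
    joint = {}
    for e in itertools.product((0, 1), repeat=3):
        w = sum(e)
        prob = p**w * (1 - p)**(3 - w)
        s = (e[0] ^ e[1], e[1] ^ e[2])  # parity-check syndrome
        # l = 1 if e is the heavier of the two errors sharing this syndrome,
        # i.e. applying the minimum-weight correction leaves a logical flip.
        l = int(w >= 2)
        joint[(s, l)] = joint.get((s, l), 0.0) + prob
    return joint

def syndrome_prob(joint, s):
    return joint.get((s, 0), 0.0) + joint.get((s, 1), 0.0)

def posterior(joint):
    """Exact p(l = 1 | s) for every syndrome s (what an ideal decoder outputs)."""
    return {s: joint.get((s, 1), 0.0) / syndrome_prob(joint, s)
            for s in {s for s, _ in joint}}

def binary_entropy(x):
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def conditional_entropy(joint):
    """H(l | s) in bits, computed directly from the exact posterior."""
    return sum(syndrome_prob(joint, s) * binary_entropy(p1)
               for s, p1 in posterior(joint).items())

def expected_bce(joint, q):
    """Expected binary cross-entropy (bits) of a predictor q[s] for p(l = 1 | s)."""
    return -sum(prob * math.log2(q[s] if l == 1 else 1.0 - q[s])
                for (s, l), prob in joint.items())

joint = joint_syndrome_logical(0.1)
exact = posterior(joint)
floor = conditional_entropy(joint)
loss_exact = expected_bce(joint, exact)
# A deliberately miscalibrated predictor: inflate every flip probability by 50%.
miscalibrated = {s: min(0.99, 1.5 * v) for s, v in exact.items()}
loss_off = expected_bce(joint, miscalibrated)
print(f"floor={floor:.6f}  exact={loss_exact:.6f}  miscalibrated={loss_off:.6f}")
```

The exact posterior attains the floor, and any other predictor sits strictly above it. This is the sense in which a training loss that approaches the floor certifies near-optimal decoding.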
If this is right
- Threshold estimates for the surface code under code-capacity, phenomenological, and circuit-level noise match known theoretical limits.
- When deployed as a decoder the network produces lower logical error rates than minimum-weight perfect matching.
- Soft post-selection that filters on the network's confidence for each logical operator separately is optimal and scales well in both logical error rate and abort probability.
- The same maximum-likelihood-coset post-selection strategy remains scalable for larger code distances.
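The confidence-based filtering in the list above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the function names, the per-operator threshold `tau`, and the example numbers are all hypothetical.

```python
def soft_post_select(flip_probs, tau):
    """Accept a shot only if every logical operator is predicted with
    confidence >= tau. Returns a list of (accepted, decision) per shot.

    flip_probs: list of (p_x, p_z), predicted flip probabilities for the two
                logical operators (stand-ins for network outputs).
    tau:        per-operator confidence threshold in [0.5, 1] (assumed name).
    """
    results = []
    for probs in flip_probs:
        # Each operator is judged on its own confidence, independently.
        confident = all(max(p, 1.0 - p) >= tau for p in probs)
        decision = tuple(int(p > 0.5) for p in probs)
        results.append((confident, decision))
    return results

def summarize(results, true_flips):
    """Abort probability and logical error rate among accepted shots."""
    accepted = [(d, t) for (ok, d), t in zip(results, true_flips) if ok]
    abort_prob = 1.0 - len(accepted) / len(results)
    errors = sum(d != t for d, t in accepted)
    ler = errors / len(accepted) if accepted else float("nan")
    return abort_prob, ler

# Made-up illustrative values, not data from the paper.
flip_probs = [(0.02, 0.01), (0.45, 0.03), (0.97, 0.04), (0.08, 0.60)]
true_flips = [(0, 0), (1, 0), (1, 0), (0, 0)]

strict = summarize(soft_post_select(flip_probs, tau=0.9), true_flips)
lenient = summarize(soft_post_select(flip_probs, tau=0.5), true_flips)
print("tau=0.9:", strict)
print("tau=0.5:", lenient)
```

Raising `tau` trades a higher abort probability for a lower logical error rate among the accepted shots, which is the trade-off the post-selection claim concerns.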
Where Pith is reading between the lines
- The loss-bound connection could be used to certify near-optimality of any decoder whose training loss approaches the coherent information.
- The approach may generalize to other stabilizer codes once suitable training data for logical operators can be generated.
- Post-selection optimality proofs based on maximum-likelihood cosets likely extend to any decoder whose output probabilities approximate the coherent-information channel.
Load-bearing premise
The network estimates the coherent information without systematic bias arising from its architecture or the choice of training data.
What would settle it
Numerical results in which the network's predicted thresholds deviate from known theoretical values or in which its logical error rate fails to beat minimum-weight perfect matching under identical surface-code noise models would falsify the central claim.
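One common way to read off a threshold is to locate the noise rate at which logical-error curves for two code distances cross, and then compare that value with a reference. The sketch below does this for a repetition code under bit-flip noise with majority-vote decoding, a toy model whose curves can be written down exactly. Whether the paper extracts thresholds this way is an assumption, and the function names are hypothetical.

```python
from math import comb

def logical_error_rate(d, p):
    """Exact failure probability of majority-vote decoding for a distance-d
    repetition code under i.i.d. bit flips (toy model, not the surface code)."""
    return sum(comb(d, w) * p**w * (1 - p)**(d - w)
               for w in range((d + 1) // 2, d + 1))

def crossing(d_small, d_large, lo, hi, tol=1e-10):
    """Bisection for the noise rate at which the two distance curves cross."""
    def gap(p):
        return logical_error_rate(d_large, p) - logical_error_rate(d_small, p)
    # Below threshold the larger code is better (gap < 0); above it is worse.
    assert gap(lo) < 0 < gap(hi), "bracket must straddle the crossing"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p_star = crossing(3, 5, lo=0.2, hi=0.9)
reference = 0.5  # crossing point of the exact curves for this toy model
print(f"estimated threshold {p_star:.6f}, deviation {abs(p_star - reference):.2e}")
```

For a learned estimator the curves carry statistical error, so the deviation would need to be judged against the reported uncertainty rather than against zero.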
Original abstract
As quantum computers remain susceptible to noise, quantum error correction (QEC) is essential for preserving logical information during computations. However, the performance of QEC codes breaks down beyond certain noise thresholds, revealing fundamental limits on their ability to protect quantum information. These limits can be characterized using information-theoretic measures such as the coherent information (CI), which quantifies the maximum rate at which logical information can be reliably transmitted through a noisy quantum channel. In this work, we establish a direct connection between the CI and the binary cross-entropy loss used when training neural network decoders. Specifically, we show that the CI constitutes a sharp lower bound on the achievable loss for decoders that track logical operators across noisy channels. To this end, we develop a transformer-based neural network model based on maximum likelihood decoding (MLD). We train this network to estimate the CI and evaluate its performance on the surface code under three noise models: code capacity, phenomenological, and circuit-level noise. Our results demonstrate that the network accurately predicts the CI and yields threshold estimates that closely match known theoretical limits. When used as a decoder, the network significantly outperforms the minimum weight perfect matching decoder in terms of logical error rate. We also introduce a novel soft post-selection scheme that independently treats uncertainty in both logical operators and relies on confidence-based filtering of the network's output. We prove that such post-selection strategies, based on the MLD cosets, are optimal, and demonstrate their scalability in terms of both logical error rate and abort probability. These findings establish transformer-based architectures as powerful tools for QEC and provide the first numerical evidence supporting the optimality and scalability of MLD-based post-selection.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims a direct link between the coherent information (CI) and binary cross-entropy loss for neural-network decoders that track logical operators across noisy channels, proving that CI is a sharp lower bound on achievable loss. A transformer model trained via maximum-likelihood decoding is used to estimate CI for the surface code under code-capacity, phenomenological, and circuit-level noise; the authors report that the network accurately predicts CI, produces threshold estimates matching known theoretical limits, outperforms minimum-weight perfect matching as a decoder, and that a novel soft post-selection scheme based on MLD cosets is optimal and scalable.
Significance. If the central claims hold without systematic bias in the CI estimates, the work would provide a theoretically grounded method for using ML to estimate QEC thresholds and improve decoding performance, with the optimality proof for MLD-based post-selection constituting a clear contribution. The reported numerical agreement with independently known thresholds supplies external grounding.
Major comments (1)
- [Abstract and surface-code results] The claim that the network 'accurately predicts CI' and yields thresholds that 'closely match known theoretical limits' rests on the premise that the transformer trained via maximum-likelihood decoding produces unbiased CI estimates. Any architecture- or data-induced systematic offset would invalidate both the threshold-matching result and the asserted superiority over MWPM, since those conclusions are derived from the lower-bound relation alone.
Simulated Author's Rebuttal
We thank the referee for the careful reading and for identifying this key point about potential bias in the CI estimates. We address it directly below.
Point-by-point responses
- Referee: [Abstract and surface-code results] The claim that the network 'accurately predicts CI' and yields thresholds that 'closely match known theoretical limits' rests on the premise that the transformer trained via maximum-likelihood decoding produces unbiased CI estimates. Any architecture- or data-induced systematic offset would invalidate both the threshold-matching result and the asserted superiority over MWPM, since those conclusions are derived from the lower-bound relation alone.
Authors: We agree that the absence of systematic bias is essential for the validity of the threshold and decoder-performance claims. Our central theorem establishes that binary cross-entropy is bounded from below by the coherent information, with equality achieved precisely under maximum-likelihood decoding. The transformer is trained to minimize this loss, which corresponds to MLD. To verify lack of architecture- or data-induced offset, we have performed additional checks on small surface-code instances (distance 3–5) where the exact CI can be computed by enumeration; the network estimates agree with these exact values to within statistical error. In the revised manuscript we will add a dedicated validation subsection presenting these comparisons, together with an analysis of convergence to the MLD limit as a function of training data volume and model capacity. We will also revise the abstract to state that the estimates are consistent with exact results on small instances and match known thresholds within reported uncertainties, thereby making the supporting evidence explicit.
Revision: yes
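The equality condition in the authors' reply can be motivated by a standard identity for cross-entropy, written here for a single binary logical label and a syndrome. The symbols are notation introduced for this sketch only: $s$ is the syndrome, $\ell$ the logical label, $p(\ell \mid s)$ the exact posterior, $q(\ell \mid s)$ the network output, and $\mathcal{L}(q)$ the expected loss. The step that identifies the entropy term with the coherent information is the paper's theorem and is not reproduced here.

```latex
\begin{align}
\mathcal{L}(q)
  &= -\sum_{s} P(s) \sum_{\ell \in \{0,1\}} p(\ell \mid s)\, \log q(\ell \mid s) \\
  &= \underbrace{-\sum_{s} P(s) \sum_{\ell \in \{0,1\}} p(\ell \mid s)\, \log p(\ell \mid s)}_{H(\ell \mid s)}
     \;+\; \sum_{s} P(s)\, D_{\mathrm{KL}}\!\big(p(\cdot \mid s)\,\big\|\,q(\cdot \mid s)\big) \\
  &\ge H(\ell \mid s).
\end{align}
```

Because the Kullback-Leibler term is non-negative and vanishes only when $q(\cdot \mid s) = p(\cdot \mid s)$ for every syndrome with $P(s) > 0$, the floor is reached exactly when the network reproduces the posterior that maximum-likelihood decoding uses. A systematic offset in $q$ would appear as a strictly positive gap above the floor, which is what the proposed small-instance checks are meant to detect.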
Circularity Check
No circularity; central claims grounded by independent theoretical thresholds
The paper derives that coherent information (CI) is a sharp lower bound on binary cross-entropy loss for decoders tracking logical operators, then trains a transformer via maximum-likelihood decoding to estimate CI on surface-code instances. Threshold estimates are validated by direct numerical agreement with independently known theoretical limits from prior literature on code-capacity, phenomenological, and circuit-level noise; this external match supplies non-circular grounding. The optimality proof for MLD-coset post-selection is self-contained and does not rely on fitted values or self-citations. No step reduces a reported prediction to its own training inputs by construction, and no load-bearing premise collapses to a self-citation chain.