arxiv: 2605.10566 · v1 · submitted 2026-05-11 · 📊 stat.ML · cs.LG· cs.NA· math.NA

Recognition: no theorem link

Affine Tracing: A New Paradigm for Probabilistic Linear Solvers

Disha Hegde, Jon Cockayne, Marvin Pf\"ortner

Pith reviewed 2026-05-12 04:58 UTC · model grok-4.3

classification 📊 stat.ML cs.LGcs.NAmath.NA

keywords probabilistic linear solversaffine probabilistic iterative methodsBayesian inferenceaffine tracingcalibrationmultigrid methodsGaussian process approximation

0 comments

The pith

Bayesian probabilistic linear solvers are special cases of non-stationary affine probabilistic iterative methods, and all realistic affine PIMs are calibrated.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper dismantles the usual split between two families of probabilistic linear solvers. Bayesian methods, which update a prior using projections of the linear system, turn out to be exactly the non-stationary affine subset of probabilistic iterative methods. In addition, any affine PIM that could arise in practice produces uncertainty estimates that are calibrated, meaning the reported probabilities match the actual frequency of errors. This unification removes the need to choose between the two approaches and instead focuses effort on building and simplifying affine probabilistic versions of existing iterative solvers. The authors supply an automatic tool, affine tracing, that converts an ordinary implementation of an affine iterative method into its probabilistic counterpart by tracking symbolic computations and simplifying the resulting graph.

Core claim

We show that Bayesian PLSs are a special case of non-stationary affine PIMs. We further prove that any realistic affine PIM is calibrated. These facts motivate a focus on non-stationary affine PIMs. To make them practical we introduce affine tracing, which automatically constructs a PIM from a standard implementation of an affine iterative method by passing symbolic tracers through the computation to build an affine computational graph, transforming the graph to compute posterior covariances, and applying equality saturation to simplify the graph for chosen priors.

What carries the argument

Affine tracing: an algorithmic procedure that injects symbolic tracers into a standard affine iterative solver to produce an affine computational graph, from which posterior means and covariances are extracted after algebraic simplification.

If this is right

Bayesian probabilistic linear solvers can be re-implemented and analyzed as instances of non-stationary affine PIMs.
Any realistic affine PIM automatically delivers calibrated uncertainty estimates for the solution of linear systems.
Affine tracing removes most manual derivation work when turning a classical affine iterative method into a probabilistic one.
Probabilistic multigrid solvers can be generated automatically and applied to tasks such as Gaussian-process approximation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same tracing technique could be applied to other affine-structured computations such as certain optimization or time-stepping methods.
Calibration guarantees might carry over to linear systems that arise inside larger probabilistic models if the affine property is preserved.
Automated generation of probabilistic solvers could let practitioners test many different iterative schemes on the same problem with little extra coding.

Load-bearing premise

The underlying iterative methods must stay affine once the chosen priors are imposed, and equality saturation must simplify the symbolic graphs correctly and efficiently for priors that arise in practice.

What would settle it

An explicit Bayesian PLS that cannot be rewritten as a non-stationary affine PIM, or a simple linear system on which an affine PIM produces uncertainty intervals whose coverage frequency deviates from the claimed probability.

Figures

Figures reproduced from arXiv: 2605.10566 by Disha Hegde, Jon Cockayne, Marvin Pf\"ortner.

read the original abstract

Probabilistic linear solvers (PLSs) return probability distributions that quantify uncertainty due to limited computation in the solution of linear systems. The literature has traditionally distinguished between Bayesian PLSs, which condition a prior on information obtained from projections of the linear system, and probabilistic iterative methods (PIMs), which lift classical iterative solvers to probability space. In this work we show this dichotomy to be false: Bayesian PLSs are a special case of non-stationary affine PIMs. In addition, we prove that any realistic affine PIM is calibrated. These results motivate a focus on (non-stationary) affine PIMs, but their practical adoption has been limited by the significant manual effort required to implement them. To address this, we introduce affine tracing, an algorithmic framework that automatically constructs a PIM from a standard implementation of an affine iterative method by passing symbolic tracers through the computation to build an affine computational graph. We show how this graph can be transformed to compute posterior covariances, and how equality saturation can be used to perform algebraic simplifications required for computation under specific prior choices. We demonstrate the framework by automatically generating a probabilistic multigrid solver and evaluate its performance in the context of Gaussian process approximation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper unifies Bayesian PLSs as non-stationary affine PIMs and supplies a working symbolic-tracing tool to generate them automatically from ordinary code.

read the letter

The central result is that the split between Bayesian probabilistic linear solvers and probabilistic iterative methods is not fundamental. Bayesian PLSs turn out to be the non-stationary special case of affine PIMs, and any realistic affine PIM is calibrated. That removes a false dichotomy and focuses attention on the affine setting where both views coincide. The second contribution is affine tracing: pass symbolic tracers through a standard affine iterative solver to build an affine computational graph, then transform the graph to obtain posterior covariances. Equality saturation is used to simplify the resulting expressions for concrete prior choices. They apply the method to produce a probabilistic multigrid solver and test it on Gaussian-process approximation tasks. The unification is clean and the automation removes the manual derivation burden that has slowed adoption. The multigrid example shows the pipeline can be made to run on a non-trivial solver. The proofs appear to rest on explicit definitions rather than fitted parameters, and the stress-test confirms no hidden circularity in the embedding or calibration arguments. The main limitation is that the empirical section is still narrow—one solver family and one downstream task—so the practical cost of the symbolic graph construction and saturation steps is not yet quantified across a range of problem sizes. Readers will also want to see how sensitive the simplifications are to prior choice. The work is aimed at people already working in probabilistic numerics or uncertainty-aware scientific computing. Anyone who has tried to hand-derive a probabilistic version of an iterative method will see immediate value in the tracing framework. The paper is coherent on its own terms and the claims are falsifiable, so it deserves a serious referee. I would send it to review.

Referee Report

3 major / 2 minor

Summary. The paper claims that the traditional distinction between Bayesian probabilistic linear solvers (PLSs) and probabilistic iterative methods (PIMs) is false, as Bayesian PLSs form a special case of non-stationary affine PIMs. It further proves that any realistic affine PIM is calibrated. To facilitate practical use, the authors introduce affine tracing, which automatically constructs an affine computational graph from a standard implementation of an affine iterative method via symbolic tracers, applies transformations to compute posterior covariances, and uses equality saturation for algebraic simplifications under specific priors. The framework is demonstrated by generating a probabilistic multigrid solver and evaluating it for Gaussian process approximation.

Significance. If the embedding construction and calibration result hold, the work would unify two strands of probabilistic numerics and supply a general calibration guarantee, which is a substantive theoretical contribution. The affine-tracing framework addresses a genuine implementation barrier by automating graph construction and simplification, and the multigrid demonstration provides a concrete, reproducible example of scaling to larger problems. Explicit use of symbolic methods and equality saturation for correctness-preserving transformations is a strength that could be cited as a positive example of automated reasoning in the field.

major comments (3)

[§3.2] §3.2 (Embedding construction): The claim that Bayesian PLSs are recovered exactly as the non-stationary special case of affine PIMs requires that the underlying iterative methods remain affine under the chosen priors; the manuscript states this as an assumption but does not supply the precise conditions on the prior or the iteration operator that guarantee affinity, which is load-bearing for the 'special case' result.
[§4.1] §4.1 (Calibration theorem): The statement that 'any realistic affine PIM is calibrated' is central, yet the definition of 'realistic' is introduced only informally and excludes certain cases without an explicit, checkable characterization; without this, it is unclear whether the calibration proof applies to the priors used in the multigrid demonstration.
[§5.3] §5.3 (Equality saturation): The claim that equality saturation reliably simplifies the symbolic graphs without losing correctness or efficiency is used to justify practical computation, but no formal invariant or termination argument is given for the rewrite rules under the specific prior choices; this affects the reproducibility of the posterior covariance computation.

minor comments (2)

[§2.3] The notation for the affine computational graph (introduced in §2.3) is used before its full definition; a forward reference or early diagram would improve readability.
[Figure 3] Figure 3 (multigrid performance) lacks error bars or multiple random seeds; adding these would make the empirical comparison with standard GP methods clearer.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback, which has helped identify opportunities to strengthen the clarity and rigor of the manuscript. We address each major comment point by point below, indicating the revisions we will incorporate.

read point-by-point responses

Referee: [§3.2] §3.2 (Embedding construction): The claim that Bayesian PLSs are recovered exactly as the non-stationary special case of affine PIMs requires that the underlying iterative methods remain affine under the chosen priors; the manuscript states this as an assumption but does not supply the precise conditions on the prior or the iteration operator that guarantee affinity, which is load-bearing for the 'special case' result.

Authors: We thank the referee for this observation. The manuscript indeed treats affinity preservation as an assumption in the embedding construction. In the revised version, we will add a new lemma in §3.2 that states the precise conditions: the prior must be Gaussian with covariance operator commuting with the iteration's linear part, and the iteration operator must be affine (i.e., of the form x ↦ Ax + b with A linear). This lemma will explicitly recover the Bayesian PLS posterior as the non-stationary affine PIM case, thereby making the special-case result fully rigorous. revision: yes
Referee: [§4.1] §4.1 (Calibration theorem): The statement that 'any realistic affine PIM is calibrated' is central, yet the definition of 'realistic' is introduced only informally and excludes certain cases without an explicit, checkable characterization; without this, it is unclear whether the calibration proof applies to the priors used in the multigrid demonstration.

Authors: We agree that the informal phrasing of 'realistic' leaves room for ambiguity. We will replace the informal definition in §4.1 with a formal one: an affine PIM is realistic if its covariance update is exactly the Bayesian conditioning step (i.e., the posterior covariance is the Schur complement of the joint covariance under the linear observation model) and no extraneous information is injected. We will also add a short verification paragraph confirming that the Gaussian priors employed in the multigrid demonstration satisfy this definition, thereby ensuring the calibration theorem applies directly. revision: yes
Referee: [§5.3] §5.3 (Equality saturation): The claim that equality saturation reliably simplifies the symbolic graphs without losing correctness or efficiency is used to justify practical computation, but no formal invariant or termination argument is given for the rewrite rules under the specific prior choices; this affects the reproducibility of the posterior covariance computation.

Authors: The referee is right that a formal termination proof for equality saturation is absent. Correctness is preserved because all rewrite rules are equivalences (they maintain the affine structure and the value of the expression), but general termination depends on the e-graph size and is not guaranteed for arbitrary priors. In the revision we will (i) state the preserved invariants explicitly (affine form and equivalence of the represented function), (ii) list the complete set of rewrite rules used for the Gaussian-process priors, and (iii) report the saturation statistics (number of iterations and final e-graph size) for the multigrid example. These additions will make the posterior-covariance computation reproducible while acknowledging that a general termination theorem lies outside the paper's scope. revision: partial

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper's central claims rest on explicit definitions of affine PIMs, an embedding construction showing Bayesian PLSs as the non-stationary special case, and a separate calibration proof for realistic affine PIMs. These steps are presented as independent mathematical arguments in the full text, supported by stated assumptions and constructions that do not reduce to self-definition, fitted parameters renamed as predictions, or load-bearing self-citations. The affine tracing framework is introduced as a downstream implementation tool rather than a load-bearing part of the theoretical equivalence or calibration results. No patterns of self-definitional loops, ansatz smuggling, or renaming known results appear in the derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entities

Review based solely on abstract; full details on assumptions unavailable. The work rests on standard linear algebra and probabilistic numerics assumptions plus the new affine tracing construction.

axioms (2)

domain assumption Iterative methods for linear systems can be represented as affine operations
Central to defining affine PIMs and the tracing framework in the abstract.
domain assumption Priors on the solution exist such that calibration holds for realistic affine PIMs
Required for the stated proof that any realistic affine PIM is calibrated.

invented entities (1)

Affine computational graph no independent evidence
purpose: Symbolic representation built by tracers to enable posterior covariance computation and algebraic simplification
Core new construct introduced by the affine tracing framework.

pith-pipeline@v0.9.0 · 5519 in / 1408 out tokens · 71162 ms · 2026-05-12T04:58:49.109487+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages

[1]

Simon Bartels, Jon Cockayne, Ilse C. F. Ipsen, and Philipp Hennig. Probabilistic linear solvers: a unifyingview. Statistics and Computing, 29 0 (6): 0 1249--1263, November 2019. ISSN 1573-1375. doi:10.1007/s11222-019-09897-7

work page doi:10.1007/s11222-019-09897-7 2019
[2]

Bernstein

Dennis S. Bernstein. Matrix mathematics: theory, facts, and formulas. Princeton University Press, Princeton, N.J, 2nd ed edition, 2009. ISBN 978-0-691-13287-7 978-0-691-14039-1

work page 2009
[3]

Oates, Ilse C

Jon Cockayne, Chris J. Oates, Ilse C. F. Ipsen, and Mark Girolami. A Bayesian Conjugate Gradient Method (with Discussion ). Bayesian Analysis, 14 0 (3): 0 937--1012, September 2019. ISSN 1936-0975, 1931-6690. doi:10.1214/19-BA1145. Publisher: International Society for Bayesian Analysis

work page doi:10.1214/19-ba1145 2019
[4]

Ipsen, Chris J

Jon Cockayne, Ilse C.F. Ipsen, Chris J. Oates, and Tim W. Reid. Probabilistic iterative methods for linear systems. The Journal of Machine Learning Research, 22 0 (1): 0 232:10505--232:10538, January 2021. ISSN 1532-4435

work page 2021
[5]

Graham, Chris J

Jon Cockayne, Matthew M. Graham, Chris J. Oates, T. J. Sullivan, and Onur Teymur. Testing Whether a Learning Procedure is Calibrated . Journal of Machine Learning Research, 23 0 (203): 0 1--36, 2022. ISSN 1533-7928

work page 2022
[6]

Optimization Under Unknown Constraints

Howard Elman, David Silvester, and Andy Wathen. Finite Elements and Fast Iterative Solvers: with Applications in Incompressible Fluid Dynamics. Oxford University PressOxford, 2014. ISBN 9780191780745. doi:10.1093/acprof:oso/9780199678792.001.0001

work page doi:10.1093/acprof:oso/9780199678792.001.0001 2014
[7]

Golub and Charles F

Gene H. Golub and Charles F. Van Loan. Matrix computations. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, Baltimore, MD, fourth edition, 2013. ISBN 978-1-4214-0794-4; 1-4214-0794-9; 978-1-4214-0859-0

work page 2013
[8]

Calibrated Computation - Aware Gaussian Processes

Disha Hegde, Mohamed Adil, and Jon Cockayne. Calibrated Computation - Aware Gaussian Processes . In Proceedings of The 28th International Conference on Artificial Intelligence and Statistics , pages 2098--2106. PMLR, April 2025. ISSN: 2640-3498

work page 2098
[9]

Probabilistic interpretation of linear solvers

Philipp Hennig. Probabilistic interpretation of linear solvers. SIAM Journal on Optimization, 25 0 (1): 0 234–260, January 2015. ISSN 1095-7189. doi:10.1137/140955501

work page doi:10.1137/140955501 2015
[10]

Osborne, and Hans P

Philipp Hennig, Michael A. Osborne, and Hans P. Kersting. Probabilistic Numerics : Computation as Machine Learning . Cambridge University Press, Cambridge, 2022. ISBN 978-1-107-16344-7. doi:10.1017/9781316681411

work page doi:10.1017/9781316681411 2022
[11]

Hersbach, B

H. Hersbach, B. Bell, P. Berrisford, G. Biavati, A. Horányi, J. Muñoz Sabater, J. Nicolas, C. Peubey, R. Radu, I. Rozum, D. Schepers, A. Simmons, C. Soci, D. Dee, and J-N Thépaut. ERA5 hourly data on single levels from 1940 to present, 2023. Accessed on 2026-05-04. Licence permits free use for any lawful purpose under CC-BY

work page 1940
[12]

Online conformal probabilistic numerics via adaptive edge-cloud offloading

Qiushuo Hou, Sangwoo Park, Matteo Zecchin, Yunlong Cai, Guanding Yu, and Simeone Osvaldo. Online conformal probabilistic numerics via adaptive edge-cloud offloading. In First International Conference on Probabilistic Numerics, 2025

work page 2025
[13]

Carl Edward Rasmussen and Christopher K. I. Williams. Gaussian processes for machine learning. Adaptive computation and machine learning. MIT Press, Cambridge, Mass, 2006. ISBN 978-0-262-18253-9

work page 2006
[14]

Reid, Ilse C

Tim W. Reid, Ilse C. F. Ipsen, Jon Cockayne, and Chris J. Oates. Statistical properties of BayesCG under the Krylov prior. Numerische Mathematik, 155 0 (3): 0 239--288, December 2023. ISSN 0945-3245. doi:10.1007/s00211-023-01375-7

work page doi:10.1007/s00211-023-01375-7 2023
[15]

Egglog python: A pythonic library for e-graphs, 2023

Saul Shanabrook. Egglog python: A pythonic library for e-graphs, 2023. URL https://arxiv.org/abs/2305.04311. Presented at EGRAPHS@PLDI 2023

work page arXiv 2023
[16]

Robust and computation-aware gaussian processes

Marshal Sinaga, Julien Martinelli, and Samuel Kaski. Robust and computation-aware gaussian processes. In Conference on Neural Information Processing Systems, 2025

work page 2025
[17]

A. M. Stuart. Inverse problems: A bayesian perspective. Acta Numerica, 19: 0 451–559, May 2010. ISSN 1474-0508. doi:10.1017/s0962492910000061

work page doi:10.1017/s0962492910000061 2010
[18]

Equality saturation: A new approach to optimization

Ross Tate, Michael Stepp, Zachary Tatlock, and Sorin Lerner. Equality saturation: A new approach to optimization. Logical Methods in Computer Science, Volume 7, Issue 1, March 2011. ISSN 1860-5974. doi:10.2168/lmcs-7(1:10)2011

work page doi:10.2168/lmcs-7(1:10)2011 2011
[19]

Accelerating Generalized Linear Models by Trading off Computation for Uncertainty , October 2023

Lukas Tatzel, Jonathan Wenger, Frank Schneider, and Philipp Hennig. Accelerating Generalized Linear Models by Trading off Computation for Uncertainty , October 2023. URL http://arxiv.org/abs/2310.20285

work page arXiv 2023
[20]

Trottenberg, C

U. Trottenberg, C. W. Oosterlee, and Anton Schüller. Multigrid. Academic Press, San Diego, 2001. ISBN 978-0-12-701070-0

work page 2001
[21]

Randomised Postiterations for Calibrated BayesCG

Niall Vyas, Disha Hegde, and Jon Cockayne. Randomised Postiterations for Calibrated BayesCG . In Proceedings of the First International Conference on Probabilistic Numerics . PMLR, August 2025

work page 2025
[22]

Cunningham

Jonathan Wenger, Geoff Pleiss, Marvin Pförtner, Philipp Hennig, and John P. Cunningham. Posterior and Computational Uncertainty in Gaussian Processes . Advances in Neural Information Processing Systems, 35: 0 10876--10890, December 2022

work page 2022
[23]

Gardner, Geoff Pleiss, and John P

Jonathan Wenger, Kaiwen Wu, Philipp Hennig, Jacob R. Gardner, Geoff Pleiss, and John P. Cunningham. Computation- Aware Gaussian Processes : Model Selection And Linear - Time Inference . Advances in Neural Information Processing Systems, 37: 0 31316--31349, December 2024. doi:10.52202/079017-0984

work page doi:10.52202/079017-0984 2024
[24]

egg: Fast and extensible equality saturation

Max Willsey, Chandrakana Nandi, Yisu Remy Wang, Oliver Flatt, Zachary Tatlock, and Pavel Panchekha. egg: Fast and extensible equality saturation. Proceedings of the ACM on Programming Languages, 5 0 (POPL): 0 1–29, January 2021. ISSN 2475-1421. doi:10.1145/3434304

work page doi:10.1145/3434304 2021
[25]

Functional Analysis

K\^osaku Yosida. Functional Analysis. Springer Berlin Heidelberg, 1995. ISBN 9783642618598. doi:10.1007/978-3-642-61859-8

work page doi:10.1007/978-3-642-61859-8 1995
[26]

David M. Young. Iterative solution of large linear systems. Dover Publications, Inc., Mineola, NY, 2003. ISBN 0-486-42548-7. Unabridged republication of the 1971 edition [Academic Press, New York-London, MR 305568]

work page 2003