Towards Universal Convergence of Backward Error in Linear System Solvers
Pith reviewed 2026-05-10 08:06 UTC · model grok-4.3
The pith
Richardson iteration achieves at most 1/k relative backward error after k steps on any positive semidefinite linear system, independent of condition number.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For any positive semidefinite matrix A and vector b, the k-th Richardson iterate x_k satisfies that the relative backward error is at most 1/k: there exist perturbations Delta A and Delta b with ||Delta A||/||A|| + ||Delta b||/||b|| <= 1/k such that (A + Delta A)x_k = b + Delta b. This bound is derived directly from the residual contraction properties of the iteration on PSD matrices and does not involve the condition number. The same framework yields an O(1/k^{2}) rate when the iterate is chosen to minimize backward error over the Krylov subspace, which is realized by the MINBERR algorithm.
What carries the argument
The residual contraction of Richardson iteration on PSD matrices that directly controls the minimal perturbation size making the current approximate solution exact.
Load-bearing premise
The matrix must be positive semidefinite for the unconditional 1/k backward error bound to hold.
What would settle it
Compute the relative backward error of the k-th Richardson iterate on a positive semidefinite matrix whose condition number exceeds k; if that error ever exceeds 1/k for any such matrix, the claim is false.
Figures
read the original abstract
The quest for an algorithm that solves an $n\times n$ linear system in $O(n^2)$ time complexity, or $O(n^2 \text{poly}(1/\epsilon))$ when solving up to $\epsilon$ relative error, is a long-standing open problem in numerical linear algebra and theoretical computer science. There are two predominant paradigms for measuring relative error: forward error (i.e., distance from the output to the optimum solution) and backward error (i.e., distance to the nearest problem solved by the output). In most prior studies, convergence of iterative linear system solvers is measured via various notions of forward error, and as a result, depends heavily on the conditioning of the input. Yet, the numerical analysis literature has long advocated for backward error as the more practically relevant notion of approximation. In this work, we show that -- surprisingly -- the classical and simple Richardson iteration incurs at most $1/k$ (relative) backward error after $k$ iterations on any positive semidefinite (PSD) linear system, irrespective of its condition number. This universal convergence rate implies an $O(n^2/\epsilon)$ complexity algorithm for solving a PSD linear system to $\epsilon$ backward error, and we establish similar or better complexity when using a variety of Krylov solvers beyond Richardson. Then, by directly minimizing backward error over a Krylov subspace, we attain an even faster $O(1/k^2)$ universal rate, and we turn this into an efficient algorithm, MINBERR, with complexity $O(n^2/\sqrt\epsilon)$. Finally, we extend this approach via normal equations to solving general linear systems in $O(n^2\log(n)/\epsilon)$ time complexity. We report strong numerical performance of our algorithms on benchmark problems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that the classical Richardson iteration achieves at most 1/k relative backward error after k iterations on any positive semidefinite linear system, independent of condition number. This yields an O(n²/ε) algorithm for ε-backward-error solutions. Similar or better rates are established for other Krylov solvers; a new MINBERR algorithm that directly minimizes backward error over the Krylov subspace attains an O(1/k²) universal rate and O(n²/√ε) complexity. For general (non-PSD) systems the normal-equations approach is shown empirically to give O(1/k) convergence, with strong numerical performance reported on benchmarks.
Significance. If the central derivation holds, the result is significant: it supplies the first explicit conditioning-independent convergence guarantees in the backward-error metric, which the numerical-analysis literature has long preferred over forward error. The argument rests on the spectral radius of the Richardson operator being at most 1 for PSD matrices and on the standard normwise backward-error formula, yielding a clean, parameter-free 1/k bound. The construction of MINBERR and the empirical extension to general matrices via normal equations add practical value and suggest new solver design principles focused on backward-error minimization.
minor comments (3)
- The complexity statements (O(n²/ε) and O(n²/√ε)) should explicitly state the underlying arithmetic model and whether they count matrix-vector products or full matrix operations.
- The precise definition of relative backward error η = ||b − A x|| / (||A|| ||x|| + ||b||) is used in the abstract and introduction; a short dedicated paragraph or equation early in the paper would aid readers.
- The section reporting numerical experiments would benefit from a brief statement of the benchmark matrices, stopping criteria, and any data-exclusion rules applied.
Simulated Author's Rebuttal
We are grateful to the referee for their supportive review and recommendation for minor revision. The referee's summary correctly captures the key contributions of our work on universal convergence rates for backward error in linear system solvers.
read point-by-point responses
-
Referee: The paper claims that the classical Richardson iteration achieves at most 1/k relative backward error after k iterations on any positive semidefinite linear system, independent of condition number. This yields an O(n²/ε) algorithm for ε-backward-error solutions. Similar or better rates are established for other Krylov solvers; a new MINBERR algorithm that directly minimizes backward error over the Krylov subspace attains an O(1/k²) universal rate and O(n²/√ε) complexity. For general (non-PSD) systems the normal-equations approach is shown empirically to give O(1/k) convergence, with strong numerical performance reported on benchmarks.
Authors: We thank the referee for this accurate encapsulation of our results. The claims are as stated in the manuscript, and we stand by the derivations and empirical findings presented. No changes are required. revision: no
Circularity Check
No significant circularity identified
full rationale
The claimed 1/k backward-error bound for Richardson iteration on PSD systems is derived directly from the spectral radius of the iteration matrix (at most 1 when the step size is 1/||A||_2) and the standard normwise backward-error formula. For eigenvalues in [0, lambda_max], the residual component stays O(1) while the iterate component grows linearly in k, producing eta <= 1/k independent of condition number. The PSD assumption is explicitly required and stated; the derivation does not reduce to any fitted parameter, self-definition, or self-citation chain. The extension to general matrices is presented only as empirical observation, not a proven claim. No load-bearing ansatz, uniqueness theorem, or renaming of known results appears in the provided text. The result is self-contained against the mathematical properties of PSD matrices and Krylov subspaces.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The input matrix is positive semidefinite
Forward citations
Cited by 1 Pith paper
-
Well-Conditioned Oblivious Perturbations in Linear Space
An O(n)-randomness perturbation combining a dense deterministic pattern matrix with a non-uniform sparse dependent perturbation reduces condition numbers to O(n) for any input matrix.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.