Fault Tolerance of Accelerated Asynchronous Fixed-Point Iterations on Flexible Computing Infrastructure
Pith reviewed 2026-06-29 09:50 UTC · model grok-4.3
The pith
Anderson acceleration survives asynchrony only when staleness perturbs the fixed-point evaluation rather than corrupting iterates directly.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Asynchronous execution provides wall-clock speedups of 2.9x for Jacobi, 7.7x for value iteration, and 16.9x for SCF independent of acceleration. Anderson acceleration fails categorically under iterate-level corruption but retains its benefits under evaluation-level perturbation, making the staleness mechanism the primary determinant of whether acceleration survives asynchronous execution.
What carries the argument
The distinction between iterate-level corruption, where stale worker returns directly overwrite portions of the accelerated iterate, and evaluation-level perturbation, where staleness acts as a bounded perturbation to the fixed-point map evaluation.
If this is right
- Straggler tolerance holds universally, delivering the reported speedups with a 100 ms delayed worker in every tested method.
- Anderson acceleration loses all benefit under iterate-level corruption but keeps its improvement under evaluation-level perturbation.
- The location where staleness enters the computation overrides contraction norm or smoothness as the deciding factor for accelerator survival.
- The observed behavior matches the perturbation analysis cited from Toth et al. (2017).
Where Pith is reading between the lines
- Solvers intended for asynchronous use should route updates to prevent direct overwrite of accelerated iterates when acceleration is applied.
- The staleness-mechanism distinction could be tested on additional fixed-point problems to check whether it holds beyond the three methods studied.
- Distributed execution frameworks could detect the type of staleness and conditionally enable or disable acceleration.
Load-bearing premise
That the difference between iterate-level corruption and evaluation-level perturbation observed in these three methods is the main driver of acceleration behavior and that the pattern generalizes beyond the tested cases and Ray framework.
What would settle it
Modify the block Jacobi implementation so that staleness enters only as an evaluation perturbation rather than direct iterate overwrite, then check whether Anderson acceleration regains its convergence benefit under asynchrony.
Figures
read the original abstract
Asynchronous iterative methods tolerate straggling processors by allowing workers to proceed with stale data, but at a cost: the iterates become inconsistent, potentially degrading convergence. We investigate whether convergence accelerators such as Anderson acceleration compensate for this degradation. We experimentally study three fixed-point iterations: the Jacobi method for sparse linear systems, value iteration for the Bellman equation, and the Hartree--Fock self-consistent field (SCF) iteration. The experiments are conducted using a high-performance execution framework Ray, which abstracts the complexity of distributed systems and enables code parallelization and fault injection with minimal changes. We establish two main results. First, straggler tolerance is universal: asynchronous execution provides wall-clock speedups of $2.9\times$ (Jacobi), $7.7\times$ (VI), and $16.9\times$ (SCF) over synchronous execution with a 100\,ms-delayed worker, independent of whether acceleration is used. Second, Anderson acceleration's effectiveness under asynchrony depends on where staleness enters the computation. We identify two staleness mechanisms: iterate-level corruption, where stale worker returns directly overwrite portions of the accelerated iterate (as in block Jacobi), and evaluation-level perturbation, where staleness acts as a bounded perturbation to the fixed-point map evaluation (as in VI and SCF). Anderson acceleration fails categorically under the first mechanism but retains its benefits under the second, consistent with the perturbation analysis of Toth et al.\ (2017). This distinction, rather than the contraction norm or smoothness of the map, is the primary determinant of whether acceleration survives asynchronous execution.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper experimentally studies Anderson acceleration for asynchronous fixed-point iterations on three maps (block Jacobi for linear systems, value iteration for Bellman equations, and Hartree-Fock SCF) using the Ray framework. It reports that asynchrony yields wall-clock speedups of 2.9× (Jacobi), 7.7× (VI), and 16.9× (SCF) independent of acceleration, and that Anderson acceleration fails categorically under iterate-level corruption but retains benefits under evaluation-level perturbation; the authors conclude that this staleness-mechanism distinction, rather than contraction norm or smoothness, is the primary determinant of whether acceleration survives asynchrony, consistent with Toth et al. (2017) perturbation analysis.
Significance. If the mechanism distinction is confirmed, the results supply practical guidance on when convergence accelerators remain useful in distributed asynchronous settings with stragglers. The choice of Ray for fault injection and parallelization supports reproducibility, and the direct experimental link to existing perturbation theory is a constructive contribution.
major comments (1)
- [Abstract and experimental results] Abstract and experimental results: the claim that the iterate-level vs. evaluation-level staleness distinction 'rather than the contraction norm or smoothness of the map, is the primary determinant' is not supported by the design. The three maps differ simultaneously in staleness mechanism, contraction properties, and smoothness; no controlled variation holds the mechanism fixed while changing the norm or Lipschitz constant, leaving the observed categorical failure in Jacobi versus retention in VI/SCF open to confounding by map properties.
minor comments (1)
- The abstract and results sections report experimental outcomes but provide no details on number of runs, statistical tests, error bars, or controls for confounding factors, which limits verifiability of the speedup and failure-mode claims.
Simulated Author's Rebuttal
We thank the referee for the careful reading and the constructive critique of our experimental design and claims. We respond to the major comment below.
read point-by-point responses
-
Referee: [Abstract and experimental results] Abstract and experimental results: the claim that the iterate-level vs. evaluation-level staleness distinction 'rather than the contraction norm or smoothness of the map, is the primary determinant' is not supported by the design. The three maps differ simultaneously in staleness mechanism, contraction properties, and smoothness; no controlled variation holds the mechanism fixed while changing the norm or Lipschitz constant, leaving the observed categorical failure in Jacobi versus retention in VI/SCF open to confounding by map properties.
Authors: We agree that the experimental design does not provide a controlled isolation of the staleness mechanism while varying contraction norm or smoothness; the three maps were selected as representative applications rather than as a factorial design. The observed categorical difference (failure under iterate-level corruption in Jacobi, retention under evaluation-level perturbation in VI and SCF) is consistent with the perturbation analysis of Toth et al. (2017), but confounding by map properties cannot be excluded on the basis of these experiments alone. In the revised version we will (i) replace the phrasing 'is the primary determinant' in the abstract with 'is consistent with the mechanism distinction being the primary determinant according to existing perturbation theory' and (ii) add a limitations paragraph in the discussion section that explicitly notes the lack of controlled variation and calls for future work that holds the map fixed while varying only the injection point of staleness. revision: yes
Circularity Check
No circularity; empirical results are self-contained observations
full rationale
The paper reports direct experimental measurements of wall-clock speedups and convergence behavior under injected delays in the Ray framework for three distinct fixed-point maps. Claims about staleness mechanisms (iterate-level vs. evaluation-level) and Anderson acceleration survival are presented as outcomes of those controlled runs rather than quantities derived from equations or parameters fitted inside the paper. The single external citation to Toth et al. (2017) supplies an independent perturbation analysis and does not form a self-citation chain. No load-bearing step reduces by construction to a fitted input, self-definition, or renamed known result.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Fixed-point iterations are expected to converge when the underlying map satisfies appropriate contraction or smoothness conditions.
Forward citations
Cited by 1 Pith paper
-
Residual-Weighted Randomized Jacobi: Sharpened Bounds via Residual Concentration and Asynchronous Extension
Residual-weighted randomized Jacobi with ℓ=2 selection sharpens one-step convergence by exactly the inverse participation ratio ν² of the residual and extends the analysis to asynchronous execution where the same quan...
Reference graph
Works this paper leans on
-
[1]
Donald G. Anderson. 1965. Iterative Procedures for Nonlinear Integral Equations.J. ACM12, 4 (1965), 547–560. doi:10.1145/321296.321305
-
[2]
Hartwig Anzt, Jack Dongarra, and Enrique S. Quintana-Ortí. 2019. Fine-grained bit-flip protection for relaxation methods.Journal of Computational Science36 (2019). doi:10.1016/j.jocs.2016.11.013 14
-
[3]
Archibald, Ken I.M
Thomas W. Archibald, Ken I.M. McKinnon, and Lyn C. Thomas. 1995.On the Generation of Markov Decision Processes. Technical Report. University of Edinburgh. Journal of the Operational Research Society, 46(3):354–361
1995
-
[4]
Haim Avron, Alex Druinsky, and Anshul Gupta. 2015. Revisiting Asynchronous Linear Solvers: Provable Convergence Rate Through Randomization. J. ACM62, 6 (2015), 51:1–51:27. doi:10.1145/2814566
-
[5]
Banerjee, Phanish Suryanarayana, and John E
Amartya S. Banerjee, Phanish Suryanarayana, and John E. Pask. 2016. Periodic Pulay Method for Robust and Efficient Convergence Acceleration of Self-Consistent Field Iterations.Chemical Physics Letters647 (2016), 31–35. doi:10.1016/j.cplett.2016.01.033
-
[6]
Gérard M. Baudet. 1978. Asynchronous Iterative Methods for Multiprocessors.J. ACM25, 2 (1978), 226–244. doi:10.1145/322063.322067
-
[7]
Bertsekas and John N
Dimitri P. Bertsekas and John N. Tsitsiklis. 1989.Parallel and Distributed Computation: Numerical Methods. Prentice Hall
1989
-
[8]
Bertsekas and John N
Dimitri P. Bertsekas and John N. Tsitsiklis. 1996.Neuro-Dynamic Programming. Athena Scientific, Belmont, MA
1996
-
[9]
Daniel Chazan and Willard Miranker. 1969. Chaotic Relaxation.Linear Algebra Appl.2, 2 (1969), 199–222. doi:10.1016/0024-3795(69)90028-7
-
[10]
Zizhong Chen. 2013. Online-ABFT: An online algorithm based fault tolerance scheme for soft error detection in iterative methods. InProc. 18th ACM SIGPLAN Symp. Principles and Practice of Parallel Programming (PPoPP). 167–176. doi:10.1145/2442516.2442533
-
[11]
Edmond Chow, Hartwig Anzt, and Jack Dongarra. 2015. Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs. InHigh Performance Computing (ISC High Performance 2015) (Lecture Notes in Computer Science, Vol. 9137). Springer, 1–16. doi:10.1007/978-3-319-20119-1_1
-
[12]
Edmond Chow, Andreas Frommer, and Daniel B. Szyld. 2021. Asynchronous Richardson Iterations: Theory and Practice.Numerical Algorithms87 (2021), 1635–1651. doi:10.1007/s11075-020-01023-3
-
[13]
Evan Coleman, Erik J Jensen, and Masha Sosonkina. 2021. Fault recovery methods for asynchronous linear solvers.International Journal of Parallel Programming49, 1 (2021), 51–80
2021
-
[14]
Evan Coleman and Masha Sosonkina. 2018. Self-stabilizing fine-grained parallel incomplete LU factorization.Sustainable Computing: Informatics and Systems19 (2018), 291–304
2018
-
[15]
Caleb Evans, Sara Pollock, Leo G. Rebholz, and Mengying Xiao. 2020. A Proof That Anderson Acceleration Improves the Convergence Rate in Linearly Converging Fixed-Point Methods (But Not in Those Converging Quadratically).SIAM J. Numer. Anal.58, 1 (2020), 788–810. doi:10.1137/19M1245384
-
[16]
Haw-ren Fang and Yousef Saad. 2009. Two Classes of Multisecant Methods for Nonlinear Acceleration.Numerical Linear Algebra with Applications 16, 3 (2009), 197–221. doi:10.1002/nla.617
-
[17]
Andreas Frommer and Daniel B. Szyld. 2000. On Asynchronous Iterations.J. Comput. Appl. Math.123, 1–2 (2000), 201–216. doi:10.1016/S0377- 0427(00)00409-X
-
[18]
Matthieu Geist and Bruno Scherrer. 2018. Anderson Acceleration for Reinforcement Learning. InEuropean Workshop on Reinforcement Learning (EWRL). arXiv:1809.09501
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[19]
Erik J Jensen, Evan Coleman, and Masha Sosonkina. 2018. Using modeling to improve the performance of asynchronous jacobi. InProceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA). The Steering Committee of The World Congress in Computer Science, Computer . . . , 117–126
2018
-
[20]
Erik J Jensen, Evan Coleman, and Masha Sosonkina. 2019. Predictive modeling of the performance of asynchronous iterative methods: EJ Jensen et al.The Journal of Supercomputing75, 8 (2019), 5084–5105
2019
-
[21]
Jordan, and Ion Stoica
Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, Melih Elibol, Zongheng Yang, William Paul, Michael I. Jordan, and Ion Stoica. 2018. Ray: A Distributed Framework for Emerging AI Applications. In13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). USENIX Association, 561–577
2018
-
[22]
Kimio Ohno. 1964. Some Remarks on the Pariser–Parr–Pople Method.Theoretica Chimica Acta2, 3 (1964), 219–227. doi:10.1007/BF00528281
-
[23]
Rudolph Pariser and Robert G. Parr. 1953. A Semi-Empirical Theory of the Electronic Spectra and Electronic Structure of Complex Unsaturated Molecules. I.The Journal of Chemical Physics21, 3 (1953), 466–471. doi:10.1063/1.1698929
-
[24]
John A. Pople. 1953. Electron Interaction in Unsaturated Hydrocarbons.Transactions of the Faraday Society49 (1953), 1375–1385. doi:10.1039/ TF9534901375
1953
-
[25]
Péter Pulay. 1980. Convergence Acceleration of Iterative Sequences. The Case of SCF Iteration.Chemical Physics Letters73, 2 (1980), 393–398. doi:10.1016/0009-2614(80)80396-4
-
[26]
Thorsten Rohwedder and Reinhold Schneider. 2011. An Analysis for the DIIS Acceleration Method Used in Quantum Chemistry Calculations. Journal of Mathematical Chemistry49, 9 (2011), 1889–1914. doi:10.1007/s10910-011-9863-y
-
[27]
Piyush Sao and Richard Vuduc. 2013. Self-stabilizing iterative solvers. InProc. Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA). 1–8. doi:10.1145/2530268.2530272
-
[28]
Wenjie Shi, Shiji Song, Hui Wu, Ya-Chien Hsu, Cheng Wu, and Gao Huang. 2019. Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning.Advances in Neural Information Processing Systems32 (2019)
2019
-
[29]
Ke Sun, Yafei Li, Yingnan Chen, Yingbin Lin, Hongyuan Zha, and Baoyuan Wu. 2021. Damped Anderson Mixing for Deep Reinforcement Learning: Acceleration, Convergence, and Stabilization. InAdvances in Neural Information Processing Systems, Vol. 34
2021
-
[30]
Qiming Sun, Timothy C. Berkelbach, Nick S. Blunt, George H. Booth, Sheng Guo, Zhendong Li, Junzi Liu, James D. McClain, Elvira R. Sayfutyarova, Sandeep Sharma, Sebastian Wouters, and Garnet Kin-Lic Chan. 2018. PySCF: The Python-Based Simulations of Chemistry Framework.WIREs Computational Molecular Science8, 1 (2018), e1340. doi:10.1002/wcms.1340
-
[31]
Attila Szabo and Neil S. Ostlund. 1996.Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory(revised ed.). Dover Publications. 15
1996
-
[32]
Austin Ellis, Thomas Evans, Steven Hamilton, C
Alex Toth, J. Austin Ellis, Thomas Evans, Steven Hamilton, C. T. Kelley, Roger Pawlowski, and Stuart Slattery. 2017. Local Improvement Results for Anderson Acceleration with Inaccurate Function Evaluations.SIAM Journal on Scientific Computing39, 5 (2017), S47–S65. doi:10.1137/16M1080677
-
[33]
Alex Toth and C. T. Kelley. 2015. Convergence Analysis for Anderson Acceleration.SIAM J. Numer. Anal.53, 2 (2015), 805–819. doi:10.1137/130919398
-
[34]
Walker and Peng Ni
Homer F. Walker and Peng Ni. 2011. Anderson Acceleration for Fixed-Point Iterations.SIAM J. Numer. Anal.49, 4 (2011), 1715–1735. doi:10.1137/ 10078356X
2011
-
[35]
Junzi Zhang, Brendan O’Donoghue, and Stephen Boyd. 2020. Globally Convergent Type-I Anderson Acceleration for Non-Smooth Fixed-Point Iterations.SIAM Journal on Optimization30, 4 (2020), 3170–3197. doi:10.1137/18M1232772 16
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.