pith. machine review for the scientific record.

arxiv: 2605.03563 · v1 · submitted 2026-05-05 · 🌌 astro-ph.IM

Recognition: unknown

Iterative Poisson Solvers for Self-gravity with the GPU Code Astaroth

Authors on Pith: no claims yet

Pith reviewed 2026-05-07 12:51 UTC · model grok-4.3

classification 🌌 astro-ph.IM
keywords: Poisson solvers · self-gravity · GPU computing · iterative methods · astrophysical simulations · multigrid · conjugate gradient · hydrodynamics coupling

The pith

Iterative Poisson solvers on GPUs compute self-gravity efficiently in astrophysical simulations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops and benchmarks several iterative methods for solving the Poisson equation for the gravitational potential on GPUs: conjugate gradient, successive over-relaxation, and multigrid on Cartesian grids, and biconjugate gradient stabilized in spherical coordinates. The solvers are validated for accuracy against analytic solutions and then coupled to hydrodynamics to model time-dependent star formation on 3D grids. Measured convergence rates and per-iteration timings are similar to those of other algorithms already implemented in Astaroth, establishing a practical approach for including self-gravity in large-scale simulations.
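To make the machinery concrete: self-gravity enters through the Poisson equation ∇²Φ = 4πGρ, which is discretized on the grid and solved iteratively. The sketch below is not the paper's GPU kernel; it is a minimal NumPy illustration of one of the named methods, successive over-relaxation on a second-order 7-point stencil, with the relaxation factor, tolerance, and zero-potential boundaries chosen purely for illustration.

```python
import numpy as np

def sor_poisson(rho, dx, G=1.0, omega=1.7, tol=1e-6, max_iter=10_000):
    """Solve the discretized Poisson equation lap(phi) = 4*pi*G*rho on a
    uniform Cartesian grid with zero-potential (Dirichlet) boundaries,
    using point-wise successive over-relaxation. Illustrative sketch only:
    omega, tol, and the boundary handling are assumptions, not the paper's."""
    phi = np.zeros_like(rho)
    rhs = 4.0 * np.pi * G * rho * dx**2
    nx, ny, nz = rho.shape
    for _ in range(max_iter):
        max_change = 0.0
        for i in range(1, nx - 1):
            for j in range(1, ny - 1):
                for k in range(1, nz - 1):
                    # Gauss-Seidel value from the 2nd-order 7-point stencil ...
                    gs = (phi[i+1, j, k] + phi[i-1, j, k]
                          + phi[i, j+1, k] + phi[i, j-1, k]
                          + phi[i, j, k+1] + phi[i, j, k-1]
                          - rhs[i, j, k]) / 6.0
                    # ... blended with the old value by the relaxation factor.
                    new = (1.0 - omega) * phi[i, j, k] + omega * gs
                    max_change = max(max_change, abs(new - phi[i, j, k]))
                    phi[i, j, k] = new
        if max_change < tol:
            break
    return phi
```

A GPU version keeps the same update formula but sweeps the grid in a red-black ordering so that all points of one color can be relaxed in parallel.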

Core claim

The development of GPU-based iterative Poisson solvers featuring novel combinations of discretizations and smoothers allows for accurate and efficient computation of self-gravity. When coupled to hydrodynamics, these solvers simulate classic star formation problems with performance similar to existing methods, providing a foundation for production-scale astrophysical simulations on structured grids.

What carries the argument

The iterative Poisson equation solvers, including conjugate gradient, successive over-relaxation, multigrid, and biconjugate gradient stabilized methods, with specific discretizations and smoothers optimized for GPU execution to compute gravitational potentials.
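As a hedged sketch of the Krylov side of that machinery (not the authors' implementation): conjugate gradient never assembles the matrix; it only needs the Laplacian applied as a stencil, which is what maps naturally onto GPU kernels. The helper names, boundary choice, and tolerance below are illustrative assumptions.

```python
import numpy as np

def apply_neg_laplacian(phi, dx):
    """-Laplacian of phi (2nd-order 7-point stencil) on the interior of a
    3-D grid, with zero-potential boundaries; the sign flip makes the
    operator symmetric positive definite, as conjugate gradient requires."""
    out = np.zeros_like(phi)
    out[1:-1, 1:-1, 1:-1] = (
        6.0 * phi[1:-1, 1:-1, 1:-1]
        - phi[2:, 1:-1, 1:-1] - phi[:-2, 1:-1, 1:-1]
        - phi[1:-1, 2:, 1:-1] - phi[1:-1, :-2, 1:-1]
        - phi[1:-1, 1:-1, 2:] - phi[1:-1, 1:-1, :-2]
    ) / dx**2
    return out

def cg_poisson(rho, dx, G=1.0, tol=1e-8, max_iter=500):
    """Matrix-free conjugate gradient for -lap(phi) = -4*pi*G*rho,
    i.e. the same Poisson problem written with an SPD operator."""
    b = np.zeros_like(rho)
    b[1:-1, 1:-1, 1:-1] = -4.0 * np.pi * G * rho[1:-1, 1:-1, 1:-1]
    phi = np.zeros_like(b)
    r = b - apply_neg_laplacian(phi, dx)
    p = r.copy()
    rs_old = float(np.vdot(r, r))
    if rs_old == 0.0:
        return phi  # zero source: the zero potential is already exact
    b_norm = np.sqrt(float(np.vdot(b, b)))
    for _ in range(max_iter):
        Ap = apply_neg_laplacian(p, dx)
        alpha = rs_old / float(np.vdot(p, Ap))
        phi += alpha * p
        r -= alpha * Ap
        rs_new = float(np.vdot(r, r))
        if np.sqrt(rs_new) < tol * b_norm:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return phi
```

A GPU implementation keeps exactly this structure, replacing the stencil application and dot products with device kernels and reductions; the spherical-coordinate variant in the paper uses BiCGSTAB, a related Krylov method that does not require a symmetric operator.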

If this is right

  • The solvers achieve accuracy comparable to known analytic solutions for gravitational potentials.
  • Convergence and per-iteration timings are measured and found competitive with other computational routines.
  • Integration with hydrodynamics enables simulation of time-dependent self-gravitating systems like star formation.
  • Performance supports extension to production-scale simulations on three-dimensional structured grids.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • These methods could allow GPU codes to handle self-gravity in problems with high density contrasts without prohibitive costs.
  • Extension to spherical coordinates broadens applicability to centrally symmetric astrophysical objects such as stars or planets.
  • Further optimization might involve adaptive time-stepping to maintain efficiency in dynamic simulations.

Load-bearing premise

The benchmarked convergence rates and timings will hold when the solvers are used in full production simulations with realistic density variations, boundary conditions, and coupled time-stepping.

What would settle it

A full-scale simulation run in which the self-gravity solver's contribution to total runtime significantly exceeds that of the hydrodynamics, or in which the computed gravitational potential deviates measurably from expected values in a test case.

Figures

Figures reproduced from arXiv:2605.03563 by Ruben Krasnopolsky (1), Miikka S. Väisälä (1, 3), Hsien Shang (1), Wei-Wen Li (1), Touko Puro (2), Matthias Rheinhardt (2), Mordecai-Mark Mac Low (4), and Maarit Korpi-Lagg (2) ((1) Institute of Astronomy and Astrophysics, Academia Sinica, Taipei, Taiwan; (2) High-Performance Computing Lab, Department of Computer Science, Aalto University, Finland; (3) Faculty of Information Technology and Electrical Engineering, University of Oulu; (4) Department of Astrophysics, American Museum of Natural History, New York, NY, USA).

Figure 1. Computed potential Φ̄ and analytical potential Φ traced along the x and z axes for the problem stated in §3.2. From top to bottom rows: results for CG 2nd order, CG 6th order, SOR 2nd order, and MG 6th order. Column 1: x-axis, resolution 256³. Column 2: z-axis (256³). Column 3: x-axis (512³). Column 4: z-axis (512³). On the bottom row (MG results), the grid sizes of Columns 1–4 are 255³, 255³, 511…
Figure 2. Relative errors of the solutions shown in…
Figure 3. Relative error of the converged solutions, shown as colormaps of the logarithm of the error between the analytical solution Φ and the converged numerical solution Φ̄, computed as log₁₀(|(Φ̄ − Φ)/max(|Φ|)|) in §3.2. From left to right: CG 2nd order, CG 6th order, SOR 2nd order, and MG 6th order. Top row: resolution 256³; bottom row: 512³ (255³ and 511³, respectively, for MG).
Figure 4. Convergence of the RMS relative errors and residuals as the number of iterations increases. From top to bottom rows: RMS relative numerical error δ, RMS relative analytical error ϵ, and the RMS relative residuals. From left to right columns: CG 2nd order, CG 6th order, SOR 2nd order, and MG 6th order. Within each panel, the line colors indicate the grid resolution (64: red; 128: cyan; 256: blue; 512: bla…
Figure 5. Collapse of an isothermal sphere. Dimensionless infall velocity −v_r/c_s vs. dimensionless radial coordinate x = r/(c_s t) at t = 1.5 × 10⁴ yr. Left: full scale; right: zoom into the region where the analytical SIS solution (blue solid line) features a discontinuous angle at x = 1, numerically very demanding for both Astaroth (red solid line) and ZEUS (magenta dashes). We have also run ZEUS using an initial ma…
original abstract

We present the development and benchmarking of Poisson solvers for graphics processing units (GPUs). Implemented in the Astaroth platform, the solvers feature high computational efficiency. We present novel combinations of discretizations and smoothers and document practical and performance-focused implementations aimed at reducing time-to-solution for self-gravitating systems. We describe the solver architectures and validate their accuracy against known analytic solutions. We measure convergence and timing per iteration for various solver algorithms, including conjugate gradient, successive overrelaxation, and multigrid in Cartesian coordinates, along with biconjugate gradient stabilized in spherical coordinates. We also couple the solvers to the Astaroth hydrodynamics to simulate a classic time-dependent problem in star formation, measuring accuracy and time-to-solution, for self-gravity on three-dimensional structured grids. Our results demonstrate that the solvers achieve performance similar to other algorithms implemented in Astaroth, and provide a solid foundation for integration into production-scale astrophysical simulations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents the implementation and benchmarking of iterative Poisson solvers (including conjugate gradient, successive overrelaxation, multigrid in Cartesian coordinates, and biconjugate gradient stabilized in spherical coordinates) for self-gravity within the GPU-based Astaroth code. It describes novel combinations of discretizations and smoothers, validates accuracy against analytic solutions, reports convergence rates and per-iteration timings, and demonstrates coupling to the Astaroth hydrodynamics module on a single time-dependent 3D star-formation test problem using structured grids. The authors conclude that the solvers achieve performance comparable to other Astaroth algorithms and provide a solid foundation for integration into production-scale astrophysical simulations.

Significance. If the reported accuracy, convergence, and timing results hold under broader conditions, the work supplies practical GPU implementations of Poisson solvers that could improve efficiency in self-gravitating hydrodynamics codes. The validation against analytic solutions and the direct coupling to an existing hydrodynamics framework are strengths that support incremental progress in the field. However, the significance for production-scale use is limited by the narrow scope of the presented tests.

major comments (2)
  1. [coupling to hydrodynamics and results sections] The central claim that the solvers 'provide a solid foundation for integration into production-scale astrophysical simulations' (abstract and concluding section) is load-bearing on the representativeness of the reported convergence rates, iteration counts, and relative timings. These are measured only on a single classic star-formation test case with moderate density contrasts on 3-D structured grids; no benchmarks are shown for regimes with density contrasts orders of magnitude higher, varied boundary conditions, or the constraints of adaptive time-stepping and operator splitting with the hydrodynamics module. If iteration counts or performance degrade under those conditions, the evidence for the claim is insufficient.
  2. [validation and benchmarking sections] The validation section reports accuracy against analytic solutions and timing per iteration for the various solvers, but does not provide quantitative error norms (e.g., L2 or L∞ norms), grid resolutions used in the convergence studies, or details on any post-hoc tuning of relaxation parameters. Without these, it is difficult to assess whether the claimed performance similarity to other Astaroth algorithms is robust or specific to the tested setups.
minor comments (2)
  1. [abstract] The abstract states that the solvers 'achieve performance similar to other algorithms implemented in Astaroth' but does not specify which algorithms or provide quantitative comparison metrics in the summary; this should be clarified with a brief reference to the relevant table or figure.
  2. [solver architectures section] Notation for the spherical-coordinate biconjugate gradient stabilized solver could be made more explicit when first introduced, particularly regarding the handling of coordinate singularities.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed report. We address each major comment below and will revise the manuscript to incorporate the suggested improvements.

point-by-point responses
  1. Referee: [coupling to hydrodynamics and results sections] The central claim that the solvers 'provide a solid foundation for integration into production-scale astrophysical simulations' (abstract and concluding section) is load-bearing on the representativeness of the reported convergence rates, iteration counts, and relative timings. These are measured only on a single classic star-formation test case with moderate density contrasts on 3-D structured grids; no benchmarks are shown for regimes with density contrasts orders of magnitude higher, varied boundary conditions, or the constraints of adaptive time-stepping and operator splitting with the hydrodynamics module. If iteration counts or performance degrade under those conditions, the evidence for the claim is insufficient.

    Authors: We agree that the presented results are based on a single representative test case. The classic star-formation problem is a standard benchmark that exercises the coupled hydrodynamics-self-gravity system on structured grids. Nevertheless, we acknowledge the limited scope and will revise the abstract and concluding sections to moderate the claim, explicitly noting that the foundation is demonstrated for the tested conditions. We will also add a brief discussion of potential limitations for higher-contrast regimes and adaptive time-stepping, along with plans for future work. revision: yes

  2. Referee: [validation and benchmarking sections] The validation section reports accuracy against analytic solutions and timing per iteration for the various solvers, but does not provide quantitative error norms (e.g., L2 or L∞ norms), grid resolutions used in the convergence studies, or details on any post-hoc tuning of relaxation parameters. Without these, it is difficult to assess whether the claimed performance similarity to other Astaroth algorithms is robust or specific to the tested setups.

    Authors: We thank the referee for highlighting these omissions. In the revised manuscript we will add L2 and L∞ error norms for the analytic validation tests, specify the grid resolutions employed in the convergence studies, and document the relaxation parameters together with any tuning procedures used. These additions will make the performance comparisons more quantitative and reproducible. revision: yes
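For reference, the norms promised in this response are standard diagnostics; a minimal sketch (not the authors' code) of how relative L2 and L∞ errors against an analytic potential could be computed:

```python
import numpy as np

def relative_error_norms(phi_num, phi_exact):
    """Relative L2 and L-infinity error norms of a numerical potential
    against an analytic reference. Illustrative helper, not from the paper."""
    diff = phi_num - phi_exact
    rel_l2 = np.sqrt(np.sum(diff**2) / np.sum(phi_exact**2))
    rel_linf = np.max(np.abs(diff)) / np.max(np.abs(phi_exact))
    return rel_l2, rel_linf
```

Figure 3 of the paper already plots the related pointwise quantity log₁₀(|(Φ̄ − Φ)/max(|Φ|)|), so the global norms would complement rather than replace that diagnostic.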

Circularity Check

0 steps flagged

No circularity in implementation, validation, or performance claims.

full rationale

The manuscript presents GPU implementations of Poisson solvers (CG, SOR, multigrid, BiCGSTAB) with novel discretization-smoother combinations, validates pointwise accuracy and convergence rates directly against external analytic solutions, and reports measured iteration counts and wall-clock timings on a single 3-D star-formation test problem. No derived quantity is obtained by fitting a parameter to a subset of the same data and then relabeling the fit as a prediction; no uniqueness theorem or ansatz is imported via self-citation to force the choice of method; and the central performance claim rests on direct benchmarking rather than on any self-referential reduction. The work is therefore self-contained as an engineering and measurement contribution.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper rests on standard numerical analysis results for iterative solvers and finite-difference discretizations of the Poisson equation; no new free parameters, ad-hoc axioms, or invented physical entities are introduced.

axioms (2)
  • standard math: Convergence properties of conjugate gradient, SOR, multigrid, and BiCGSTAB on structured grids
    Invoked when reporting iteration counts and timing per iteration.
  • domain assumption: Consistency of the discrete Poisson operator with the continuous equation for the gravitational potential on Cartesian and spherical meshes
    Underlying the validation against analytic solutions.
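The standard result behind the second axiom, stated here for the paper's second-order Cartesian case, is the textbook truncation-error statement for the 7-point stencil (a well-known fact, not a derivation from the paper):

```latex
% Consistency of the 2nd-order 7-point discrete Laplacian with grid spacing h:
\[
(\nabla_h^2 \Phi)_{i,j,k}
  \equiv \frac{\Phi_{i+1,j,k}+\Phi_{i-1,j,k}+\Phi_{i,j+1,k}+\Phi_{i,j-1,k}
               +\Phi_{i,j,k+1}+\Phi_{i,j,k-1}-6\,\Phi_{i,j,k}}{h^{2}}
  = \nabla^2 \Phi\big|_{i,j,k} + \mathcal{O}(h^{2}),
\]
% so enforcing (nabla_h^2 Phi)_{ijk} = 4 pi G rho_{ijk} yields a discrete
% solution that approaches the continuous potential as h -> 0, provided the
% boundary conditions are imposed consistently.
```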

pith-pipeline@v0.9.0 · 5592 in / 1073 out tokens · 50565 ms · 2026-05-07T12:51:36.181408+00:00 · methodology

