Iterative Poisson Solvers for Self-gravity with the GPU Code Astaroth
Pith reviewed 2026-05-07 12:51 UTC · model grok-4.3
The pith
Iterative Poisson solvers on GPUs compute self-gravity efficiently in astrophysical simulations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The development of GPU-based iterative Poisson solvers featuring novel combinations of discretizations and smoothers allows for accurate and efficient computation of self-gravity. When coupled to hydrodynamics, these solvers simulate classic star formation problems with performance similar to existing methods, providing a foundation for production-scale astrophysical simulations on structured grids.
What carries the argument
The iterative Poisson solvers (conjugate gradient, successive over-relaxation, multigrid, and biconjugate gradient stabilized), with discretizations and smoothers tailored for GPU execution, compute the gravitational potentials.
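To make the solver and smoother ingredients concrete, here is a minimal NumPy sketch of one red-black successive over-relaxation sweep for the seven-point discretization of ∇²Φ = 4πGρ on a uniform Cartesian grid. It illustrates the technique only; the function name, the relaxation factor, and the fixed (Dirichlet) boundary treatment are assumptions of the sketch, not details taken from the Astaroth implementation.

```python
import numpy as np

def sor_sweep(phi, rhs, h, omega=1.7):
    """One red-black SOR sweep for the seven-point discrete Poisson
    equation laplacian(phi) = rhs on a uniform grid with spacing h.
    Boundary values of phi are held fixed (Dirichlet); illustrative only."""
    i, j, k = np.indices(phi.shape)
    interior = (slice(1, -1),) * 3
    for color in (0, 1):  # update red points first, then black points
        mask = ((i + j + k) % 2 == color)[interior]
        neighbors = (phi[2:, 1:-1, 1:-1] + phi[:-2, 1:-1, 1:-1] +
                     phi[1:-1, 2:, 1:-1] + phi[1:-1, :-2, 1:-1] +
                     phi[1:-1, 1:-1, 2:] + phi[1:-1, 1:-1, :-2])
        gauss_seidel = (neighbors - h**2 * rhs[interior]) / 6.0
        interior_view = phi[interior]  # a view, so assignment updates phi
        interior_view[mask] = ((1.0 - omega) * interior_view[mask]
                               + omega * gauss_seidel[mask])
    return phi
```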
If this is right
- The solvers achieve accuracy comparable to known analytic solutions for gravitational potentials.
- Convergence and per-iteration timings are measured and found competitive with other computational routines (see the sketch after this list).
- Integration with hydrodynamics enables simulation of time-dependent self-gravitating systems like star formation.
- Performance supports extension to production-scale simulations on three-dimensional structured grids.
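As an illustration of the second point, a minimal matrix-free conjugate gradient for the seven-point discrete Poisson problem with homogeneous Dirichlet boundaries, recording the residual norm after every iteration. This is the generic textbook formulation under assumed boundary conditions, not the paper's solver; the function names and tolerance are illustrative.

```python
import numpy as np

def neg_laplacian(u, h):
    """Matrix-free application of the negative seven-point Laplacian with
    homogeneous Dirichlet boundaries; u holds the interior unknowns."""
    up = np.pad(u, 1)  # zero values outside the domain
    return (6.0 * u
            - up[2:, 1:-1, 1:-1] - up[:-2, 1:-1, 1:-1]
            - up[1:-1, 2:, 1:-1] - up[1:-1, :-2, 1:-1]
            - up[1:-1, 1:-1, 2:] - up[1:-1, 1:-1, :-2]) / h**2

def cg_poisson(rhs, h, tol=1e-8, max_iter=500):
    """Conjugate gradient for (-laplacian) phi = rhs, recording the
    residual norm after each iteration to track convergence."""
    phi = np.zeros_like(rhs)
    r = rhs - neg_laplacian(phi, h)
    p = r.copy()
    rs = np.vdot(r, r)
    history = []
    for _ in range(max_iter):
        Ap = neg_laplacian(p, h)
        alpha = rs / np.vdot(p, Ap)
        phi += alpha * p
        r -= alpha * Ap
        rs_new = np.vdot(r, r)
        history.append(np.sqrt(rs_new))
        if np.sqrt(rs_new) < tol * np.linalg.norm(rhs):
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return phi, history
```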
Where Pith is reading between the lines
- These methods could allow GPU codes to handle self-gravity in problems with high density contrasts without prohibitive costs.
- Extension to spherical coordinates broadens applicability to centrally symmetric astrophysical objects such as stars or planets.
- Further optimization might involve adaptive time-stepping to maintain efficiency in dynamic simulations.
Load-bearing premise
The benchmarked convergence rates and timings will hold when the solvers are used in full production simulations with realistic density variations, boundary conditions, and coupled time-stepping.
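To spell out what coupled time-stepping involves, here is a schematic of the common operator-split pattern in which a Poisson solve is interleaved with a hydrodynamic update each step. The function names, the callables `poisson_solve` and `hydro_step`, and the cgs value of G are assumptions of the sketch; the actual Astaroth scheduling may differ.

```python
import numpy as np

G = 6.674e-8  # gravitational constant, cgs (assumed unit system)

def advance_with_self_gravity(state, dt, h, poisson_solve, hydro_step):
    """One operator-split step: solve for the potential from the current
    density, then advance the hydrodynamics with the gravitational
    acceleration applied as a source term."""
    rho = state["rho"]
    # Self-gravity: solve the discrete Poisson equation for the potential.
    phi = poisson_solve(4.0 * np.pi * G * rho, h)
    # Gravitational acceleration g = -grad(phi), centered differences.
    gx = -np.gradient(phi, h, axis=0)
    gy = -np.gradient(phi, h, axis=1)
    gz = -np.gradient(phi, h, axis=2)
    # Hydrodynamics update with gravity as a momentum/energy source.
    return hydro_step(state, dt, accel=(gx, gy, gz))
```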
What would settle it
A full-scale simulation run where the self-gravity solver's contribution to total runtime significantly exceeds that of the hydrodynamics or where the gravitational potential deviates measurably from expected values in a test case.
Original abstract
We present the development and benchmarking of Poisson solvers for graphics processing units (GPUs). Implemented in the Astaroth platform, the solvers feature high computational efficiency. We present novel combinations of discretizations and smoothers and document practical and performance-focused implementations aimed at reducing time-to-solution for self-gravitating systems. We describe the solver architectures and validate their accuracy against known analytic solutions. We measure convergence and timing per iteration for various solver algorithms, including conjugate gradient, successive overrelaxation, and multigrid in Cartesian coordinates, along with biconjugate gradient stabilized in spherical coordinates. We also couple the solvers to the Astaroth hydrodynamics to simulate a classic time-dependent problem in star formation, measuring accuracy and time-to-solution, for self-gravity on three-dimensional structured grids. Our results demonstrate that the solvers achieve performance similar to other algorithms implemented in Astaroth, and provide a solid foundation for integration into production-scale astrophysical simulations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents the implementation and benchmarking of iterative Poisson solvers (including conjugate gradient, successive overrelaxation, multigrid in Cartesian coordinates, and biconjugate gradient stabilized in spherical coordinates) for self-gravity within the GPU-based Astaroth code. It describes novel combinations of discretizations and smoothers, validates accuracy against analytic solutions, reports convergence rates and per-iteration timings, and demonstrates coupling to the Astaroth hydrodynamics module on a single time-dependent 3D star-formation test problem using structured grids. The authors conclude that the solvers achieve performance comparable to other Astaroth algorithms and provide a solid foundation for integration into production-scale astrophysical simulations.
Significance. If the reported accuracy, convergence, and timing results hold under broader conditions, the work supplies practical GPU implementations of Poisson solvers that could improve efficiency in self-gravitating hydrodynamics codes. The validation against analytic solutions and the direct coupling to an existing hydrodynamics framework are strengths that support incremental progress in the field. However, the significance for production-scale use is limited by the narrow scope of the presented tests.
major comments (2)
- [coupling to hydrodynamics and results sections] The central claim that the solvers 'provide a solid foundation for integration into production-scale astrophysical simulations' (abstract and concluding section) is load-bearing on the representativeness of the reported convergence rates, iteration counts, and relative timings. These are measured only on a single classic star-formation test case with moderate density contrasts on 3-D structured grids; no benchmarks are shown for regimes with density contrasts orders of magnitude higher, varied boundary conditions, or the constraints of adaptive time-stepping and operator splitting with the hydrodynamics module. If iteration counts or performance degrade under those conditions, the evidence for the claim is insufficient.
- [validation and benchmarking sections] The validation section reports accuracy against analytic solutions and timing per iteration for the various solvers, but does not provide quantitative error norms (e.g., L2 or L∞ norms), grid resolutions used in the convergence studies, or details on any post-hoc tuning of relaxation parameters. Without these, it is difficult to assess whether the claimed performance similarity to other Astaroth algorithms is robust or specific to the tested setups.
minor comments (2)
- [abstract] The abstract states that the solvers 'achieve performance similar to other algorithms implemented in Astaroth' but does not specify which algorithms or provide quantitative comparison metrics in the summary; this should be clarified with a brief reference to the relevant table or figure.
- [solver architectures section] Notation for the spherical-coordinate biconjugate gradient stabilized solver could be made more explicit when first introduced, particularly regarding the handling of coordinate singularities.
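For reference, the standard spherical-coordinate form of the operator the comment refers to; this is the textbook expression, not notation quoted from the manuscript. The 1/r² and 1/sin²θ factors are where the coordinate singularities at r = 0 and θ = 0, π enter and must be handled by the discretization.

```latex
\nabla^{2}\Phi
  = \frac{1}{r^{2}}\frac{\partial}{\partial r}\!\left(r^{2}\frac{\partial\Phi}{\partial r}\right)
  + \frac{1}{r^{2}\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\,\frac{\partial\Phi}{\partial\theta}\right)
  + \frac{1}{r^{2}\sin^{2}\theta}\frac{\partial^{2}\Phi}{\partial\varphi^{2}}
  = 4\pi G\rho
```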
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed report. We address each major comment below and will revise the manuscript to incorporate the suggested improvements.
Point-by-point responses
-
Referee: [coupling to hydrodynamics and results sections] The central claim that the solvers 'provide a solid foundation for integration into production-scale astrophysical simulations' (abstract and concluding section) is load-bearing on the representativeness of the reported convergence rates, iteration counts, and relative timings. These are measured only on a single classic star-formation test case with moderate density contrasts on 3-D structured grids; no benchmarks are shown for regimes with density contrasts orders of magnitude higher, varied boundary conditions, or the constraints of adaptive time-stepping and operator splitting with the hydrodynamics module. If iteration counts or performance degrade under those conditions, the evidence for the claim is insufficient.
Authors: We agree that the presented results are based on a single representative test case. The classic star-formation problem is a standard benchmark that exercises the coupled hydrodynamics-self-gravity system on structured grids. Nevertheless, we acknowledge the limited scope and will revise the abstract and concluding sections to moderate the claim, explicitly noting that the foundation is demonstrated for the tested conditions. We will also add a brief discussion of potential limitations for higher-contrast regimes and adaptive time-stepping, along with plans for future work. revision: yes
-
Referee: [validation and benchmarking sections] The validation section reports accuracy against analytic solutions and timing per iteration for the various solvers, but does not provide quantitative error norms (e.g., L2 or L∞ norms), grid resolutions used in the convergence studies, or details on any post-hoc tuning of relaxation parameters. Without these, it is difficult to assess whether the claimed performance similarity to other Astaroth algorithms is robust or specific to the tested setups.
Authors: We thank the referee for highlighting these omissions. In the revised manuscript we will add L2 and L∞ error norms for the analytic validation tests, specify the grid resolutions employed in the convergence studies, and document the relaxation parameters together with any tuning procedures used. These additions will make the performance comparisons more quantitative and reproducible. revision: yes
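As a minimal illustration of the promised error metrics, a sketch of discrete L2 and L∞ norms computed against a standard analytic reference (the potential of a homogeneous sphere, a common Poisson-solver validation case). The function names, the norm weighting, and the cgs value of G are assumptions of the sketch, not the manuscript's definitions.

```python
import numpy as np

G = 6.674e-8  # gravitational constant, cgs (assumed unit system)

def uniform_sphere_potential(r, M, R):
    """Analytic potential of a homogeneous sphere of mass M and radius R,
    a common reference solution for validating Poisson solvers."""
    inside = -G * M * (3.0 * R**2 - r**2) / (2.0 * R**3)
    outside = -G * M / np.clip(r, R, None)  # clip only to avoid r = 0
    return np.where(r < R, inside, outside)

def error_norms(phi_numeric, phi_analytic, h):
    """Grid-weighted L2 and pointwise L-infinity errors of a numerical
    potential against an analytic reference on a grid with spacing h."""
    err = phi_numeric - phi_analytic
    l2 = np.sqrt(np.sum(err**2) * h**3)
    linf = np.max(np.abs(err))
    return l2, linf
```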
Circularity Check
No circularity in implementation, validation, or performance claims.
Full rationale
The manuscript presents GPU implementations of Poisson solvers (CG, SOR, multigrid, BiCGSTAB) with novel discretization-smoother combinations, validates pointwise accuracy and convergence rates directly against external analytic solutions, and reports measured iteration counts and wall-clock timings on a single 3-D star-formation test problem. No derived quantity is obtained by fitting a parameter to a subset of the same data and then relabeling the fit as a prediction; no uniqueness theorem or ansatz is imported via self-citation to force the choice of method; and the central performance claim rests on direct benchmarking rather than on any self-referential reduction. The work is therefore self-contained as an engineering and measurement contribution.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math: Convergence properties of conjugate gradient, SOR, multigrid, and BiCGSTAB on structured grids
- domain assumption: Consistency of the discrete Poisson operator with the continuous equation for the gravitational potential on Cartesian and spherical meshes
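To make the second entry concrete, the relation it relies on, in its generic textbook form (the paper's specific discretizations and smoother pairings may differ): the continuous Poisson equation for the self-gravitational potential and its standard second-order, seven-point Cartesian discretization on a grid with spacing h.

```latex
% Continuous Poisson equation for the self-gravitational potential:
\nabla^{2}\Phi = 4\pi G\rho

% Standard second-order, seven-point discretization on a uniform
% Cartesian grid with spacing h:
\frac{\Phi_{i+1,j,k}+\Phi_{i-1,j,k}+\Phi_{i,j+1,k}+\Phi_{i,j-1,k}
      +\Phi_{i,j,k+1}+\Phi_{i,j,k-1}-6\,\Phi_{i,j,k}}{h^{2}}
  = 4\pi G\,\rho_{i,j,k} + \mathcal{O}(h^{2})
```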