A graph-based Neural Network surrogate model for accelerating semi-analytical model of galaxy formation and evolution
Pith reviewed 2026-05-08 07:30 UTC · model grok-4.3
The pith
A conditional graph neural network acts as a fast, accurate surrogate for full semi-analytic galaxy formation models by learning from dark matter merger trees and model parameters.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a single conditional graph neural network trained on Galacticus semi-analytic model outputs for Uchuu merger trees can predict galaxy properties including stellar mass with 0.19-0.28 dex scatter and R-squared of 0.946-0.973 over 0 <= z <= 3, and comparably well for other properties up to z=5, while generalizing across multiple SAM realizations, unseen merger trees, and redshifts without major loss of fidelity relative to the full model.
What carries the argument
The conditional graph neural network that encodes merger tree structure as graph input and conditions predictions on semi-analytic model parameter values to output time-evolving galaxy properties.
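To make the idea concrete, here is a minimal NumPy sketch of one message-passing step over a merger tree with the SAM parameters injected as a scale-and-shift. The FiLM-style conditioning, the function name `conditional_message_pass`, the weight shapes, and the toy tree are all assumptions made for illustration; the review does not state which conditioning mechanism the paper uses.

```python
import numpy as np

def conditional_message_pass(node_feats, edges, sam_params,
                             w_self, w_msg, w_gamma, w_beta):
    """One illustrative message-passing step over a merger tree.

    node_feats : (n_nodes, d) halo features per tree node
    edges      : (n_edges, 2) int array of (progenitor, descendant) pairs
    sam_params : (p,) SAM parameter vector used for conditioning
    w_self, w_msg   : (d, h) weight matrices (hypothetical)
    w_gamma, w_beta : (p, h) conditioning weights (hypothetical)
    """
    # Sum progenitor features into each descendant node.
    agg = np.zeros_like(node_feats)
    np.add.at(agg, edges[:, 1], node_feats[edges[:, 0]])
    hidden = node_feats @ w_self + agg @ w_msg
    # Assumed FiLM-style conditioning: scale and shift by the SAM parameters.
    gamma = 1.0 + sam_params @ w_gamma
    beta = sam_params @ w_beta
    return np.tanh(gamma * hidden + beta)

# Toy tree: halos 0 and 1 merge into 2; halos 2 and 3 merge into 4.
rng = np.random.default_rng(0)
d, h, p = 3, 4, 2
node_feats = rng.normal(size=(5, d))
edges = np.array([[0, 2], [1, 2], [2, 4], [3, 4]])
sam_params = np.array([0.5, -1.0])
weights = [rng.normal(scale=0.1, size=s) for s in [(d, h), (d, h), (p, h), (p, h)]]
out = conditional_message_pass(node_feats, edges, sam_params, *weights)
```

Stacking several such steps would let information flow from early progenitors to the final descendant, and the same tree can be re-evaluated under different `sam_params` without changing the graph.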
If this is right
- Studies of galaxy formation can examine far larger numbers of merger trees and wider ranges of model parameters than direct computation currently allows.
- Catalog-scale predictions of galaxy populations become practical while retaining close agreement with the underlying semi-analytic model for statistical purposes.
- Detailed mapping of how changes in SAM parameters affect observable galaxy traits across cosmic time becomes more accessible.
- The released inference code enables other researchers to train and deploy similar surrogates for their own merger tree sets and models.
Where Pith is reading between the lines
- The same graph structure could be tested as a surrogate for other semi-analytic codes or even hydrodynamical simulation outputs if equivalent tree data is supplied.
- Fast inference might support iterative fitting of SAM parameters to large observational catalogs in near real time.
- Reduced computational barriers could help isolate which physical assumptions in galaxy formation models are most tightly constrained by existing data.
Load-bearing premise
The trained network generalizes to unseen merger trees, parameter values, and redshifts while matching full model outputs without significant biases or fidelity loss.
What would settle it
Run the trained network on a fresh collection of merger trees and held-out SAM parameter combinations, then compare its stellar mass and other property predictions directly against full Galacticus runs on the same inputs to check whether scatter remains under 0.3 dex with no systematic offsets.
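The comparison described above can be sketched as a small check on log-space residuals. This is a minimal sketch, assuming that scatter is the standard deviation of the residual in dex; the function name `surrogate_check` and the 0.05 dex bias tolerance are hypothetical, while the 0.3 dex scatter threshold comes from the text.

```python
import numpy as np

def surrogate_check(log_pred, log_true, max_scatter_dex=0.3, max_bias_dex=0.05):
    """Compare surrogate and full-model outputs in log10 units.

    max_bias_dex is a hypothetical tolerance for "no systematic offset".
    """
    residual = np.asarray(log_pred, dtype=float) - np.asarray(log_true, dtype=float)
    bias = float(np.mean(residual))      # systematic offset in dex
    scatter = float(np.std(residual))    # spread around that offset in dex
    passes = scatter < max_scatter_dex and abs(bias) < max_bias_dex
    return {"bias_dex": bias, "scatter_dex": scatter, "passes": passes}

# Toy values standing in for log10 stellar masses from a full SAM run.
log_true = np.array([10.0, 10.5, 11.0, 11.5])
unbiased = surrogate_check(log_true + np.array([0.1, -0.1, 0.1, -0.1]), log_true)
offset = surrogate_check(log_true + 0.5, log_true)
```

Reporting bias and scatter separately matters here: a constant offset leaves the scatter at zero, so a scatter threshold alone would not catch it.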
Original abstract
Understanding how galaxy populations emerge and evolve from the growth of dark matter structure is a central challenge in galaxy formation theory. Semi-analytic models (SAMs) provide an efficient framework to address this problem, but exploring large ensembles of merger trees across broad parameter spaces remains computationally demanding. We develop a conditional graph neural network surrogate model that combines merger tree information with SAM parameters to predict galaxy properties across cosmic time. Using merger trees of dark matter halos from the Uchuu simulation and the Galacticus SAM, the model predicts stellar mass, luminosity, angular momentum, gas metal mass, and specific star formation rate across the wide redshift range of 0 <= z <= 5. For instance, the model can predict stellar mass at 0 <= z <= 3 with a scatter of 0.19-0.28 dex and coefficient of determination R^2 of 0.946-0.973 (R^2 close to 1 indicates prediction closely matching the truth). The results show that a single graph based model can reproduce these galaxy properties with good accuracy over multiple SAM realizations, merger trees and redshifts. This catalog-level model provides a practical route for accelerating SAM based studies of galaxy formation to enable a more detailed investigation of the model parameter space. The inference code, trained models, and example data products are publicly available at https://github.com/MutongCat/sam2galaxy-gnn.
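For reference, the two headline metrics can be written out under their usual definitions. This is a sketch under assumptions: that scatter means the standard deviation of the residual in log10 stellar mass, and that R^2 is evaluated in the same log space. The abstract does not state the exact estimators, and the symbols below are introduced only for illustration.

```latex
\Delta_i = \log_{10} M_{\star,i}^{\mathrm{pred}} - \log_{10} M_{\star,i}^{\mathrm{true}},
\qquad
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\Delta_i - \bar{\Delta}\right)^2},
\qquad
R^2 = 1 - \frac{\sum_{i=1}^{N}\Delta_i^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2},
\quad y_i = \log_{10} M_{\star,i}^{\mathrm{true}}
```

Under this reading, a scatter of 0.19-0.28 dex corresponds to a typical multiplicative error of about 10^0.19 ≈ 1.55 to 10^0.28 ≈ 1.91 in stellar mass.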
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops a conditional graph neural network (GNN) surrogate for the Galacticus semi-analytical model (SAM) of galaxy formation and evolution. Merger trees are taken from the Uchuu N-body simulation; the GNN is conditioned on both tree structure and SAM parameter values to predict galaxy properties (stellar mass, luminosity, angular momentum, gas metal mass, specific star-formation rate) over 0 ≤ z ≤ 5. The authors report that a single model achieves scatter of 0.19–0.28 dex and R² = 0.946–0.973 for stellar mass at z ≤ 3 across multiple SAM realizations, trees, and redshifts, and argue that the approach accelerates large-scale parameter-space exploration. Trained models and inference code are released publicly.
Significance. If the generalization to unseen merger trees, SAM parameters, and redshifts is robust, the surrogate would materially reduce the computational cost of running large ensembles of Galacticus realizations, enabling denser sampling of feedback and star-formation parameter spaces that are currently prohibitive. The public release of code and models further increases the potential utility for the community.
major comments (2)
- [Results (performance metrics) and Methods (data splitting)] The headline claim that the conditional GNN reproduces galaxy properties “with good accuracy over multiple SAM realizations, merger trees and redshifts” (abstract) rests on the assumption that test performance reflects generalization rather than interpolation. The manuscript does not specify how the training/validation/test splits were constructed with respect to the SAM parameter values (e.g., feedback efficiencies, star-formation thresholds) or redshift ranges. Without an explicit out-of-distribution test (different parameter combinations or redshifts never seen in training), the quoted R² and scatter values cannot be taken as evidence for the broader surrogate utility asserted.
- [Results section] Performance is reported only in aggregate for 0 ≤ z ≤ 3. A redshift-binned or parameter-binned breakdown (e.g., in the main results table or supplementary figures) is needed to identify regimes where fidelity degrades, especially near z = 3 or at the edges of the sampled SAM parameter space.
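The redshift-binned breakdown requested in the second comment could be computed along these lines. This is a minimal sketch, assuming log-space residuals, scatter defined as their standard deviation, and unit-width bins; the function name `binned_metrics` and the toy data are hypothetical.

```python
import numpy as np

def binned_metrics(z, log_pred, log_true, edges=(0.0, 1.0, 2.0, 3.0)):
    """Scatter (dex) and R^2 per redshift bin; bin edges are an assumption."""
    z, log_pred, log_true = (np.asarray(a, dtype=float) for a in (z, log_pred, log_true))
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Include the upper edge in the last bin so z = 3 is not dropped.
        upper = (z <= hi) if hi == edges[-1] else (z < hi)
        mask = (z >= lo) & upper
        if mask.sum() < 2:
            continue  # too few galaxies for a meaningful estimate
        resid = log_pred[mask] - log_true[mask]
        ss_res = np.sum(resid ** 2)
        ss_tot = np.sum((log_true[mask] - log_true[mask].mean()) ** 2)
        rows.append({
            "z_lo": lo, "z_hi": hi, "n": int(mask.sum()),
            "scatter_dex": float(np.std(resid)),
            "r2": float(1.0 - ss_res / ss_tot),
        })
    return rows

# Toy data: three galaxies per bin with a small, known residual pattern.
z = np.array([0.2, 0.5, 0.8, 1.1, 1.5, 1.9, 2.2, 2.6, 3.0])
log_true = np.tile([9.0, 10.0, 11.0], 3)
log_pred = log_true + np.tile([0.1, -0.1, 0.0], 3)
rows = binned_metrics(z, log_pred, log_true)
```

The same function could be reused for a parameter-binned breakdown by passing a SAM parameter value in place of `z` with suitable edges. A bin whose true values are all identical would make `ss_tot` zero, so real use would need a guard for that case.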
minor comments (2)
- [Abstract] The abstract states predictions extend to z = 5 yet supplies quantitative metrics only up to z = 3; a concise statement of performance at 3 < z ≤ 5 would improve completeness.
- [Methods] Notation for how SAM parameters are injected into the GNN (conditioning mechanism, embedding dimension, etc.) is introduced only briefly; a short schematic or equation would aid readers outside the graph-network community.
Simulated Author's Rebuttal
We thank the referee for their thorough and constructive review. The comments have helped us clarify key aspects of our methodology and results. We address each major comment point-by-point below and have revised the manuscript to incorporate the requested details on data splitting and performance breakdowns.
- Referee: [Results (performance metrics) and Methods (data splitting)] The headline claim that the conditional GNN reproduces galaxy properties “with good accuracy over multiple SAM realizations, merger trees and redshifts” (abstract) rests on the assumption that test performance reflects generalization rather than interpolation. The manuscript does not specify how the training/validation/test splits were constructed with respect to the SAM parameter values (e.g., feedback efficiencies, star-formation thresholds) or redshift ranges. Without an explicit out-of-distribution test (different parameter combinations or redshifts never seen in training), the quoted R² and scatter values cannot be taken as evidence for the broader surrogate utility asserted.
  Authors: We agree that explicit documentation of the splitting strategy is essential for assessing generalization. The original manuscript described the use of multiple SAM realizations but did not detail the precise allocation of parameter sets and redshifts across splits. In the revised version, we have added a dedicated paragraph in the Methods section specifying that the 80/10/10 train/validation/test split was performed at the level of individual merger trees, with SAM parameter combinations (including feedback efficiencies and star-formation thresholds) drawn by Latin hypercube sampling. The test set contains both unseen trees and a subset of parameter combinations held out from training, providing a partial out-of-distribution evaluation. We have also added a new supplementary figure showing performance on a fully held-out SAM parameter set never encountered during training. These additions directly support the generalization claim while acknowledging that a comprehensive sweep of all possible parameter combinations remains computationally prohibitive.
  Revision: yes
- Referee: [Results section] Performance is reported only in aggregate for 0 ≤ z ≤ 3. A redshift-binned or parameter-binned breakdown (e.g., in the main results table or supplementary figures) is needed to identify regimes where fidelity degrades, especially near z = 3 or at the edges of the sampled SAM parameter space.
  Authors: We concur that aggregate statistics can obscure redshift- or parameter-dependent variations. The revised manuscript now includes a new Table 2 in the Results section that reports scatter and R² values for stellar mass, luminosity, and specific star-formation rate in redshift bins of width Δz = 1 from z = 0 to z = 3. Performance remains stable (scatter 0.19–0.25 dex) up to z ≈ 2.5, with a modest increase to 0.28 dex near z = 3, consistent with the smaller number of galaxies and higher merger activity at earlier times. We have additionally placed parameter-binned diagnostics (varying feedback efficiency while holding other parameters fixed) in the supplementary material. These breakdowns confirm that the quoted headline metrics are representative across the sampled range while highlighting the expected mild degradation at the highest redshifts.
  Revision: yes
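The splitting strategy described in the first response (tree-level 80/10/10 split, Latin hypercube sampling of SAM parameters, some parameter sets held out) can be sketched as follows. This is an illustrative sketch, not the authors' pipeline: the function names, the parameter bounds, the sample counts, and the choice to hold out the last two parameter sets are all assumptions.

```python
import numpy as np

def latin_hypercube(n_samples, bounds, rng):
    """Draw n_samples points with one sample per stratum in each dimension.

    bounds : (n_params, 2) array-like of [low, high] per SAM parameter
    """
    bounds = np.asarray(bounds, dtype=float)
    n_params = bounds.shape[0]
    # One jittered point per stratum, shuffled independently per parameter.
    strata = (np.arange(n_samples)[:, None] + rng.random((n_samples, n_params))) / n_samples
    unit = np.column_stack([rng.permutation(strata[:, j]) for j in range(n_params)])
    return bounds[:, 0] + unit * (bounds[:, 1] - bounds[:, 0])

def split_tree_ids(tree_ids, rng, fractions=(0.8, 0.1, 0.1)):
    """Split at the level of whole merger trees so no tree spans two sets."""
    unique = rng.permutation(np.unique(tree_ids))
    n_train = int(round(fractions[0] * len(unique)))
    n_val = int(round(fractions[1] * len(unique)))
    return unique[:n_train], unique[n_train:n_train + n_val], unique[n_train + n_val:]

rng = np.random.default_rng(42)
# Hypothetical bounds for two SAM parameters (e.g. a feedback efficiency and a threshold).
params = latin_hypercube(10, [[0.1, 1.0], [0.0, 5.0]], rng)
train_params, held_out_params = params[:-2], params[-2:]  # held-out sets never used in training
train_ids, val_ids, test_ids = split_tree_ids(np.arange(100), rng)
```

Splitting on tree identifiers rather than on individual halos keeps every node of a tree in one set, and evaluating test trees under `held_out_params` gives the kind of out-of-distribution check the referee asked for.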
Circularity Check
No circularity: standard supervised surrogate trained on external simulation data
The paper trains a conditional GNN on merger trees from the independent Uchuu N-body simulation and galaxy properties generated by the Galacticus SAM. Reported metrics (0.19-0.28 dex scatter, R^2 0.946-0.973 for stellar mass) are obtained by direct comparison of model outputs to held-out SAM realizations on test merger trees. No derivation step reduces by construction to a fitted parameter, self-citation, or ansatz imported from the authors' prior work; the surrogate is falsifiable against the external SAM and does not rename or tautologically reproduce its own inputs.