pith. machine review for the scientific record.

arxiv: 2604.22917 · v1 · submitted 2026-04-24 · 🌌 astro-ph.CO · astro-ph.GA


Diffusion-based Galaxy Simulations for the Roman High Latitude Survey

Diana Scognamiglio, Eric Huff, Jake H. Lee, Sergi R. Hildebrandt, Shoubaneh Hemmati


Pith reviewed 2026-05-08 09:55 UTC · model grok-4.3

keywords: diffusion models · galaxy image simulations · Roman Space Telescope · weak lensing · JWST observations · cosmological survey preparation · image generation

The pith

Diffusion models generate realistic multi-band galaxy images for Roman weak lensing by learning from transformed JWST observations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that a denoising diffusion probabilistic model, trained on JWST data converted to Roman observing conditions, can produce galaxy postage stamps whose magnitudes, sizes, ellipticities, surface brightnesses, and three-band colors match the statistical structure of real galaxies. This matters for future Roman Space Telescope analyses because existing analytic light-profile simulations cannot capture the complex morphologies needed to control shear systematics at the precision required for weak lensing cosmology. A reader following the argument sees a concrete path from high-resolution multi-band observations to large, scalable simulation catalogs that preserve both marginal distributions and covariances among galaxy properties.

Core claim

We construct Roman-like galaxy images from multi-band JWST/NIRCam observations in the GOODS fields through PSF matching, pixel-scale conversion, and interloper masking that preserves correlated noise. These images train a denoising diffusion probabilistic model that generates multi-band postage stamps in the Roman Y, J, and H filters. Validation against an independent dataset shows that the generated sample reproduces the marginal distributions and covariance structure of magnitude, size, ellipticity, peak surface brightness, and colors, with only modest deviations in low-occupancy regions of parameter space.

What carries the argument

A denoising diffusion probabilistic model trained on multi-band Roman-like galaxy postage stamps, which are obtained by transforming JWST/NIRCam data via PSF matching, pixel-scale conversion, and interloper masking.

If this is right

  • The method supplies high-fidelity galaxy populations for Roman weak lensing calibration without relying on analytic light-profile assumptions.
  • It scales to produce the large simulation volumes required for upcoming cosmological surveys.
  • The same framework can be applied to other future experiments by retraining on appropriately transformed high-resolution data.
  • Generated samples preserve both one-point and joint statistics of observable galaxy properties.
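One way to probe the last point, that joint statistics are preserved alongside one-point statistics, is to compare the correlation matrices of the measured properties in the generated and validation catalogs. The sketch below is illustrative only: the catalog layout (a dict of equal-length lists), the function names, and the toy numbers are assumptions, not the paper's pipeline.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(catalog, keys):
    """Matrix of pairwise correlations between the named catalog columns."""
    return [[pearson(catalog[a], catalog[b]) for b in keys] for a in keys]


def max_correlation_gap(cat_a, cat_b, keys):
    """Largest absolute difference between two correlation matrices."""
    ca = correlation_matrix(cat_a, keys)
    cb = correlation_matrix(cat_b, keys)
    return max(abs(p - q) for ra, rb in zip(ca, cb) for p, q in zip(ra, rb))


# Toy catalogs (made-up numbers): fainter objects are smaller in both.
keys = ["mag", "size"]
val = {"mag": [22.0, 23.0, 24.0, 25.0, 26.0], "size": [6.0, 5.1, 3.9, 3.2, 2.0]}
gen = {"mag": [22.5, 23.5, 24.5, 25.5, 26.5], "size": [5.8, 5.0, 4.1, 3.0, 2.2]}
gap = max_correlation_gap(val, gen, keys)
```

A small `gap` says only that the pairwise linear correlations agree; it does not test non-Gaussian structure in the joint distribution.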

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach could lower the computational cost of producing large simulation suites once the diffusion model is trained.
  • It opens a route to incorporate rare morphological features observed in JWST data that analytic models miss.
  • Similar data-transformation pipelines might allow diffusion models to be adapted for ground-based surveys with different noise and seeing characteristics.

Load-bearing premise

The JWST-to-Roman image transformation preserves every galaxy property that matters for weak lensing shear measurement and does not introduce artifacts that the diffusion model then reproduces.

What would settle it

If the generated galaxies show statistically significant mismatches in ellipticity distributions or color covariances compared with an independent validation catalog processed through the same photometric pipeline, the claim that the model supplies usable simulations for shear calibration would fail.
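A mismatch in a one-dimensional distribution such as ellipticity can be quantified with a two-sample Kolmogorov-Smirnov statistic, the largest gap between the two empirical cumulative distributions. The following is a minimal sketch of that statistic, not the test used in the paper; the ellipticity values are invented for illustration.

```python
import bisect


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        # Fraction of each sample that is <= x.
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Toy ellipticities (made-up numbers, not measurements from the paper).
e_val = [0.05, 0.12, 0.20, 0.31, 0.44, 0.58]
e_gen = [0.07, 0.10, 0.22, 0.35, 0.41, 0.60]
d_stat = ks_statistic(e_val, e_gen)
```

Turning the statistic into a significance level requires the sample sizes as well; a library routine such as `scipy.stats.ks_2samp` returns both the statistic and a p-value.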

Figures

Figures reproduced from arXiv: 2604.22917 by Diana Scognamiglio, Eric Huff, Jake H. Lee, Sergi R. Hildebrandt, Shoubaneh Hemmati.

Figure 1
Figure 1: Transmission curves of the Roman WFI filters (Y, J, H) and JWST NIRCam filters (F090W, F115W, F150W, F200W) used to simulate Roman-like galaxies based on JWST galaxy images.
… sources with sufficient signal-to-noise, we require magnitudes brighter than hmag < 27.0, where the magnitude cut is applied specifically to the F200W KRON flux, converted to AB magnitude. This choice ensures selection in the deepest …
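The magnitude cut quoted above relies on converting a flux density to an AB magnitude, which uses the standard AB zero point of 3631 Jy. The sketch below assumes fluxes in nanojanskys; the unit and the helper names are assumptions for illustration, since the excerpt does not state how the catalog stores the Kron flux.

```python
import math

AB_ZERO_POINT_JY = 3631.0  # flux density of a zeroth-magnitude AB source


def ab_magnitude(flux_njy):
    """AB magnitude for a flux density given in nanojanskys (assumed unit)."""
    if flux_njy <= 0:
        return math.inf  # non-positive fluxes have no finite magnitude
    return -2.5 * math.log10(flux_njy * 1e-9 / AB_ZERO_POINT_JY)


def passes_cut(flux_njy, mag_limit=27.0):
    """True if the source is brighter than the magnitude limit."""
    return ab_magnitude(flux_njy) < mag_limit
```

With this convention a flux of 1 nJy corresponds to roughly magnitude 31.4, so the limit of 27.0 keeps sources brighter than about 57.5 nJy.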
Figure 2
Figure 2: In the original images, projected neighbors and bright companions are clearly visible, while in the masked versions these contaminants are removed and replaced by statistically consistent correlated noise, leaving the central galaxy morphology untouched. The resulting Roman-like dataset consists of 19,888 masked Y, J, and H cutouts. However, the final datasets used for training and validation are obtained…
Figure 3
Figure 3: a) Illustration of the Denoising Diffusion Probabilistic Model (DDPM) architecture. b) Training diagram for the parametrized model ϵ_θ. c) Sampling procedure.
… configuration, we are able to generate galaxy stamps in as few as 25 steps without any loss in generated image quality. Our model was implemented with the Hugging Face Diffusers library (von Platen et al. 2022), which abstracts most of these concep…
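The training panel of Figure 3 rests on two standard DDPM ingredients: a closed-form way to noise a clean image to any step t, and a loss that compares the true noise with the network's prediction. The sketch below illustrates both in plain Python. It is not the authors' Diffusers-based code; the linear schedule and its endpoint values are the defaults from Ho et al. (2020), assumed here because the page does not give the paper's settings, and the four-pixel "image" is a toy.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed defaults)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form; also return the noise used."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    a = math.sqrt(alpha_bars[t])
    s = math.sqrt(1.0 - alpha_bars[t])
    xt = [a * x + s * e for x, e in zip(x0, eps)]
    return xt, eps


def training_loss(eps_true, eps_pred):
    """Mean squared error between true and predicted noise (simplified DDPM loss)."""
    return sum((p - q) ** 2 for p, q in zip(eps_true, eps_pred)) / len(eps_true)


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
pixels = [0.0, 0.5, 1.0, 0.5]  # a flattened toy "image"
noisy, eps = add_noise(pixels, t=999, alpha_bars=alpha_bars, rng=rng)
```

At the final step `alpha_bars[-1]` is close to zero, so `noisy` is almost pure Gaussian noise; sampling (panel c) runs the learned denoiser in the opposite direction.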
Figure 4
Figure 4: Comparison between the DDPM-generated Roman-like galaxies (Gen; top) and validation galaxies (Val; bottom). The generated images are produced using the DDPM model trained on Roman-like galaxy data, exhibiting realistic morphological features consistent with the training set. Each postage stamp is 56 × 56 pixels, corresponding to 6.16′′ × 6.16′′ at the Roman pixel scale.
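The two stamp dimensions in the caption fix the pixel scale, and the same sky area can be expressed on a finer input grid. The short calculation below makes that arithmetic explicit; the helper name and the 0.03 arcsec/pixel input scale are hypothetical, chosen only to illustrate the conversion.

```python
STAMP_PIXELS = 56    # stamp side length in pixels, from the caption
STAMP_ARCSEC = 6.16  # stamp side length on the sky, from the caption

# Pixel scale implied by the two caption values, in arcsec per pixel.
pixel_scale = STAMP_ARCSEC / STAMP_PIXELS


def stamp_side_in_input_pixels(input_scale_arcsec):
    """Side length of the same sky area on a finer input grid (hypothetical helper)."""
    return STAMP_ARCSEC / input_scale_arcsec


# 0.03 arcsec/pixel is an assumed input scale, not a value from the paper.
example_side = stamp_side_in_input_pixels(0.03)
```

The implied value of 0.11 arcsec per pixel matches the nominal Roman WFI pixel scale.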
Figure 5
Figure 5: Comparison of galaxy property distributions between the validation (Val), generated (Gen), and training (Train) samples in the Y band. The upper panels display the joint one- and two-dimensional distributions of Kron AB magnitude m_AB, moment-based radius R_m (pixel), ellipticity e, and peak surface brightness SB_peak (mag arcsec⁻²). The diagonal panels show normalized one-dimensional histograms, while the o…
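A moment-based radius and an ellipticity can both be derived from the second central moments of a postage stamp. The sketch below uses unweighted moments with R = sqrt(Qxx + Qyy) and e = sqrt((Qxx - Qyy)^2 + 4 Qxy^2) / (Qxx + Qyy). These definitions are assumptions for illustration; the caption does not say which convention or weighting the paper adopts, and real noisy stamps usually need a weight function.

```python
import math


def second_moments(image):
    """Unweighted, flux-normalised second central moments of a 2D image."""
    flux = sum(sum(row) for row in image)
    xc = sum(v * x for row in image for x, v in enumerate(row)) / flux
    yc = sum(v * y for y, row in enumerate(image) for v in row) / flux
    qxx = qyy = qxy = 0.0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            qxx += v * (x - xc) ** 2
            qyy += v * (y - yc) ** 2
            qxy += v * (x - xc) * (y - yc)
    return qxx / flux, qyy / flux, qxy / flux


def size_and_ellipticity(image):
    """Return (R, e) from the second moments, using the assumed definitions."""
    qxx, qyy, qxy = second_moments(image)
    trace = qxx + qyy
    e = math.hypot(qxx - qyy, 2.0 * qxy) / trace
    return math.sqrt(trace), e


def gaussian_stamp(n, sigma_x, sigma_y):
    """Noise-free elliptical Gaussian centred on an n x n grid (toy input)."""
    c = (n - 1) / 2.0
    return [[math.exp(-0.5 * (((x - c) / sigma_x) ** 2 + ((y - c) / sigma_y) ** 2))
             for x in range(n)] for y in range(n)]


radius, ellip = size_and_ellipticity(gaussian_stamp(56, 3.0, 1.5))
```

For the toy Gaussian with widths of 3 and 1.5 pixels, the moments are close to 9 and 2.25, so `ellip` comes out near 0.6 under this convention.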
Figure 6
Figure 6: Same as …
Figure 7
Figure 7: Same as …
Figure 8
Figure 8: Color-magnitude and color-color distributions for the validation (Val) and generated (Gen) samples. The panels show Y–J and J–H as a function of J-band magnitude (left and middle) and the Y–J versus J–H relation (right). Density contours illustrate the two-dimensional distributions, with normalized one-dimensional marginals shown along the top and right of each panel.
…ated on the background-subtracted imag…
Original abstract

Future weak lensing analyses with the Nancy Grace Roman Space Telescope will require highly realistic image simulations to control shear systematics at unprecedented precision. A key limitation of existing approaches is their reliance on analytic light-profile models, which cannot fully capture the complex, non-parametric morphologies revealed by high-resolution observations. We present a diffusion-based framework for generating realistic galaxy image simulations tailored to the weak lensing requirements of the Roman High Latitude Survey. We construct Roman-like galaxy images from multi-band JWST/NIRCam observations in the GOODS-S and GOODS-N fields, transforming them into the Roman observing regime through point-spread-function matching, pixel-scale conversion, and interloper masking that preserves correlated noise properties. These data are used to train a denoising diffusion probabilistic model to generate multi-band galaxy postage stamps in the Roman Y, J, and H filters. We validate the generated sample against an independent dataset using a consistent photometric pipeline, comparing key galaxy observables including magnitude, size, ellipticity, peak surface brightness, and three-band colors. The generated galaxies reproduce both the marginal distributions and the covariance structure of these properties, with only modest deviations in low-occupancy regions of parameter space. These results demonstrate that diffusion models provide a scalable and physically motivated alternative to analytic simulations, enabling high-fidelity galaxy populations for Roman weak lensing calibration and, more generally, for survey preparation in upcoming cosmological experiments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript introduces a denoising diffusion probabilistic model trained on multi-band JWST/NIRCam observations from GOODS-S and GOODS-N, transformed to Roman-like conditions via PSF matching, pixel-scale conversion, and interloper masking that preserves correlated noise. The model generates multi-band (Y, J, H) galaxy postage stamps, which are validated against an independent dataset using a consistent photometric pipeline by comparing marginal distributions and covariances of magnitude, size, ellipticity, peak surface brightness, and colors, with only modest deviations reported in low-occupancy regions. The central claim is that this provides a scalable, data-driven alternative to analytic light-profile models for high-fidelity galaxy populations needed in Roman weak lensing calibration.

Significance. If the generated images can be shown to yield unbiased shear estimates, the approach would represent a meaningful advance by capturing non-parametric morphologies from real high-resolution data rather than relying on parametric assumptions. The data-driven training on transformed JWST observations and the reproduction of covariance structure are strengths that could improve realism in survey preparation simulations. However, the current evidence does not yet establish control of shear systematics at the precision required for Roman weak lensing.

major comments (1)
  1. The validation procedure (as described in the abstract) compares only marginal distributions and covariances of five scalar observables (magnitude, size, ellipticity, peak surface brightness, and colors). This does not establish that the generated images produce unbiased shear estimates when passed through a Roman-like measurement pipeline, as higher-order morphological features, residual noise correlations after PSF matching, and the response of shape estimators (e.g., metacalibration) to potential diffusion artifacts at low surface brightness or in light-profile wings remain untested. A direct quantification of multiplicative and additive shear biases on the generated sample is required to support the claim of enabling high-fidelity populations for weak lensing calibration.
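The multiplicative and additive biases the referee asks for are conventionally defined through the linear model g_obs = (1 + m) g_true + c, fitted to recovered versus applied shear. The sketch below shows that fit with ordinary least squares. It is a generic illustration under that assumed model, not a Roman measurement pipeline, and the input shears are synthetic.

```python
def fit_shear_bias(g_true, g_obs):
    """Least-squares fit of g_obs = (1 + m) * g_true + c; returns (m, c)."""
    n = len(g_true)
    mean_t = sum(g_true) / n
    mean_o = sum(g_obs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(g_true, g_obs))
    var = sum((t - mean_t) ** 2 for t in g_true)
    slope = cov / var
    return slope - 1.0, mean_o - slope * mean_t


# Toy input: recovered shears with a 1% multiplicative and 0.002 additive bias.
applied = [-0.02, -0.01, 0.0, 0.01, 0.02]
recovered = [1.01 * g + 0.002 for g in applied]
m_bias, c_bias = fit_shear_bias(applied, recovered)
```

In a real test the recovered shears would come from running a shape estimator on sheared versions of the generated stamps, with uncertainties on m and c estimated from many realizations.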

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive review and for recognizing the potential of our diffusion-based approach. We address the major comment below, agreeing on the need for stronger validation while clarifying the scope of the current work.

point-by-point responses
  1. Referee: The validation procedure (as described in the abstract) compares only marginal distributions and covariances of five scalar observables (magnitude, size, ellipticity, peak surface brightness, and colors). This does not establish that the generated images produce unbiased shear estimates when passed through a Roman-like measurement pipeline, as higher-order morphological features, residual noise correlations after PSF matching, and the response of shape estimators (e.g., metacalibration) to potential diffusion artifacts at low surface brightness or in light-profile wings remain untested. A direct quantification of multiplicative and additive shear biases on the generated sample is required to support the claim of enabling high-fidelity populations for weak lensing calibration.

    Authors: We agree that direct quantification of multiplicative and additive shear biases via a Roman-like shape measurement pipeline (e.g., metacalibration) would constitute the strongest test of suitability for weak lensing calibration. Our validation was intentionally focused on reproducing the marginal distributions and covariances of photometric and morphological properties that serve as direct inputs to such pipelines, including ellipticity and size, which are central to shear estimation. We acknowledge that higher-order morphological details, residual noise correlations, and potential low-surface-brightness artifacts from the diffusion process could introduce unquantified systematics. In the revised manuscript we have added a dedicated limitations subsection (Section 5.3) that explicitly discusses these gaps, includes qualitative inspection of generated light-profile wings, and outlines planned future work to perform end-to-end shear bias measurements once a full Roman simulation framework is available. We have also tempered the abstract and conclusion language from 'enabling high-fidelity galaxy populations for Roman weak lensing calibration' to 'providing a scalable, data-driven foundation that can be integrated into future weak lensing calibration pipelines.' We view the current results as an important intermediate step demonstrating statistical fidelity of the generated population, but we do not claim that shear bias control has been demonstrated.

    Revision: partial

Circularity Check

0 steps flagged

No significant circularity in the derivation chain.

full rationale

The paper presents a data-driven pipeline: JWST observations are transformed via PSF matching, pixel rescaling, and masking to create training images in the Roman regime, a denoising diffusion model is trained on these data, and outputs are validated by comparing marginal distributions and covariances of five scalar observables (magnitude, size, ellipticity, peak surface brightness, colors) against an independent test set. No equations, ansatzes, or self-citations are invoked that reduce the generated galaxy images or the central claim to the training inputs by construction. The validation step is an external statistical comparison rather than a tautological re-expression of fitted parameters, and the approach remains self-contained against external benchmarks without load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

1 free parameters · 2 axioms · 0 invented entities

The central claim rests on the standard assumptions of denoising diffusion probabilistic models and the fidelity of the JWST-to-Roman data transformation; no new physical entities are introduced.

free parameters (1)
  • diffusion model training hyperparameters
    Number of diffusion steps, noise schedule, network architecture, and learning rate are not specified in the abstract but are required for the model.
axioms (2)
  • domain assumption: Denoising diffusion probabilistic models can faithfully capture the joint distribution of complex, non-parametric galaxy morphologies across bands.
    This is the core modeling assumption invoked when claiming the generated sample reproduces observed covariances.
  • domain assumption: The PSF matching, pixel-scale conversion, and interloper masking preserve correlated noise and morphological properties relevant to weak lensing.
    Invoked in the data preparation step before training.



Reference graph

Works this paper leans on

38 extracted references · 31 canonical work pages · 4 internal anchors

  1. [1] Akeson, R., Armus, L., Bachelet, E., et al. 2019, The Wide Field Infrared Survey Telescope: 100 Hubbles for the 2020s. https://arxiv.org/abs/1902.05569
  2. [2] Alarcon, A., Aldoroty, L., Beltz-Mohrmann, G., et al. 2025, MNRAS, 544, 3799–3823, doi: 10.1093/mnras/staf1833
     Astropy Collaboration, Robitaille, T. P., Tollerud, E. J., et al. 2013, A&A, 558, A33, doi: 10.1051/0004-6361/201322068
     Astropy Collaboration, Price-Whelan, A. M., Sipőcz, B. M., et al. 2018, AJ, 156, 123, doi: 10.3847/1538-3881/aabc4f
  3. [3] Barbary, K. 2016, The Journal of Open Source Software, 1, 58, doi: 10.21105/joss.00058
  4. [4] Bartelmann, M., & Schneider, P. 2001, Weak gravitational lensing, PhR, 340, 291, doi: 10.1016/S0370-1573(00)00082-X
  5. [5] Berlfein, F., Mandelbaum, R., Li, X., et al. 2025, MNRAS, 542, 608–628, doi: 10.1093/mnras/staf1255
  6. [6] Bertin, E., & Arnouts, S. 1996, A&AS, 117, 393, doi: 10.1051/aas:1996164
  7. [7] Bradley, L., Sipőcz, B., Robitaille, T., et al. 2024, astropy/photutils: 2.0.2, 2.0.2, Zenodo, doi: 10.5281/zenodo.13989456
  8. [8] Dhariwal, P., & Nichol, A. 2021, in Advances in Neural Information Processing Systems, ed. M. Ranzato, A. Beygelzimer, Y. Dauphin, P. Liang, & J. W. Vaughan, Vol. 34 (Curran Associates, Inc.), 8780–8794. https://proceedings.neurips.cc/paper_files/paper/2021/file/49ad23d1ec9fa4bd8d77d02681df5cfa-Paper.pdf
     Doré, O., Hirata, C., Wang, Y., et al. 2018, arXi...
  9. [9] Eisenstein, D. J., Willott, C., Alberts, S., et al. 2023, Overview of the JWST Advanced Deep Extragalactic Survey (JADES). https://arxiv.org/abs/2306.02465
  10. [10] Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., et al. 2014, Generative Adversarial Networks. https://arxiv.org/abs/1406.2661
  11. [11] Hemmati, S., Capak, P., Masters, D., et al. 2019, ApJ, 877, 117, doi: 10.3847/1538-4357/ab1be5
  12. [12] Hemmati, S., Huff, E., Nayyeri, H., et al. 2022, ApJ, 941, 141, doi: 10.3847/1538-4357/aca1b8
  13. [13] Hirata, C. M., Yamamoto, M., Laliotis, K., et al. 2024, MNRAS, 528, 2533–2561, doi: 10.1093/mnras/stae182
  14. [14] Ho, J., Jain, A., & Abbeel, P. 2020, in Advances in Neural Information Processing Systems, ed. H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, & H. Lin, Vol. 33 (Curran Associates, Inc.), 6840–6851. https://proceedings.neurips.cc/paper_files/paper/2020/file/4c5bcfec8584af0d967f1ab10179ca4b-Paper.pdf
  15. [15] Ho, J., & Salimans, T. 2022, Classifier-Free Diffusion Guidance, arXiv preprint arXiv:2207.12598
  16. [16] Hoekstra, H., Viola, M., & Herbonnet, R. 2017, MNRAS, 468, 3295–3311, doi: 10.1093/mnras/stx724
     Ivezić, Ž., Kahn, S. M., Tyson, J. A., et al. 2019, ApJ, 873, 111, doi: 10.3847/1538-4357/ab042c
  17. [17] Karras, T., Aittala, M., Aila, T., & Laine, S. 2022, Advances in neural information processing systems, 35, 26565
  18. [18] Kron, R. G. 1980, ApJS, 43, 305, doi: 10.1086/190669
  19. [19] Lallo, M. D. 2012, Optical Engineering, 51, 011011, doi: 10.1117/1.oe.51.1.011011
  20. [20] Laureijs, R., Amiaux, J., Arduini, S., et al. 2011, Euclid Definition Study Report, arXiv:1110.3193. https://arxiv.org/abs/1110.3193
  21. [21] Lizarraga, A., Jiang, E. H., Nowack, J., et al. 2024, arXiv preprint arXiv:2411.18440
  22. [22] Loshchilov, I., & Hutter, F. 2017, in International Conference on Learning Representations. https://api.semanticscholar.org/CorpusID:53592270
  23. [23] Lu, C., Zhou, Y., Bao, F., et al. 2022, Advances in neural information processing systems, 35, 5775
  24. [24] Lu, C., Zhou, Y., Bao, F., et al. 2025, DPM-Solver++: Fast solver for guided sampling of diffusion probabilistic models, Machine Intelligence Research, 22, 730, doi: 10.1007/s11633-025-1562-4
  25. [25] McElwain, M. W., Feinberg, L. D., Perrin, M. D., et al. 2023, Publications of the Astronomical Society of the Pacific, 135, 058001, doi: 10.1088/1538-3873/acada0
  26. [26] Mudur, N., Cuesta-Lazaro, C., & Finkbeiner, D. P. 2023, arXiv preprint arXiv:2312.07534
  27. [27] Perrin, M. D., Sivaramakrishnan, A., Lajoie, C.-P., et al. 2014, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 9143, Space Telescopes and Instrumentation 2014: Optical, Infrared, and Millimeter Wave, ed. J. M. Oschmann, Jr., M. Clampin, G. G. Fazio, & H. A. MacEwen, 91433X, doi: 10.1117/12.2056689
  28. [28] Sivaramakrishnan, A. 2012, in Proceedings of SPIE, Vol. 8442, Space Telescopes and Instrumentation 2012: Optical, Infrared, and Millimeter Wave, ed. M. C. Clampin, G. G. Fazio, H. A. MacEwen, & J. Oschmann, Jacobus M., 84423D, doi: 10.1117/12.925230
  29. [29] Riveros, J. K., Saavedra, P. A., Hortúa, H. J., García-Farieta, J. E., & Olier, I. 2025, Machine Learning: Science and Technology, 6, 035031
  30. [30] Ronneberger, O., Fischer, P., & Brox, T. 2015, in Medical image computing and computer-assisted intervention–MICCAI 2015: 18th international conference, Munich, Germany, October 5-9, 2015, proceedings, part III 18, Springer, 234–241
  31. [31] Rowe, B. T. P., Jarvis, M., Mandelbaum, R., et al. 2015, Astronomy and Computing, 10, 121, doi: 10.1016/j.ascom.2015.02.002
  32. [32] Hemmati, S. 2026, roman-galaxy-ddpm, Zenodo, doi: 10.5281/zenodo.19699521
  33. [33] Hemmati, S. 2025, ApJ, 985, 2, doi: 10.3847/1538-4357/adcec4
  34. [34] Smith, M. J., Geach, J. E., Jackson, R. A., et al. 2022, MNRAS, 511, 1808, doi: 10.1093/mnras/stac130
  35. [35] Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., & Ganguli, S. 2015, in Proceedings of Machine Learning Research, Vol. 37, Proceedings of the 32nd International Conference on Machine Learning, ed. F. Bach & D. Blei (Lille, France: PMLR), 2256–2265. https://proceedings.mlr.press/v37/sohl-dickstein15.html
  36. [36] Spergel, D., Gehrels, N., Baltay, C., et al. 2015, Wide-Field InfrarRed Survey Telescope-Astrophysics Focused Telescope Assets WFIRST-AFTA 2015 Report. https://arxiv.org/abs/1503.03757
  37. [37] Troxel, M. A., Long, H., Hirata, C. M., et al. 2020, MNRAS, 501, 2044–2070, doi: 10.1093/mnras/staa3658
     von Platen, P., Patil, S., Lozhkov, A., et al. 2022, Diffusers: State-of-the-art diffusion models, https://github.com/huggingface/diffusers, GitHub
  38. [38] Yamamoto, M., Laliotis, K., Macbeth, E., et al. 2024, MNRAS, 528, 6680–6705, doi: 10.1093/mnras/stae177