arxiv: 2604.24425 · v1 · submitted 2026-04-27 · 🌌 astro-ph.CO


Propagating data-driven galaxy redshift distribution uncertainties in 3×2-pt analyses

Jaime Ruiz-Zapatero, Qianjun Hang, Yun-Hao Zhang, Benjamin Joachimi, Joe Zuntz, Ian Harrison, Carlos García-García, Alex Malz, Benjamin Stölzner, The LSST Dark Energy Science Collaboration


Pith reviewed 2026-05-08 01:27 UTC · model grok-4.3

classification 🌌 astro-ph.CO

keywords redshift distribution uncertainties · n(z) · 3x2-pt analyses · Stage-IV surveys · PCA models · cosmological parameter bias · analytical marginalisation · weak lensing


The pith

Stage-IV 3x2-pt analyses require PCA models for galaxy redshift distribution uncertainties to keep parameter biases low.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests four ways to propagate uncertainties in the radial distribution of galaxies n(z) through combined weak lensing and galaxy clustering measurements. Using large sets of simulated n(z) that include both random and systematic changes, it compares simple shift-and-stretch adjustments to more flexible Gaussian process and principal component analysis descriptions. The results indicate that the basic shift-and-stretch approach leaves noticeable biases in the matter density and fluctuation amplitude, while a five-component PCA description limits those biases to half the size at the cost of only a 5 percent widening in the S8 error bar. The work also demonstrates that all four uncertainty models can be removed from the likelihood by fast analytical approximations instead of full numerical sampling.
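The shift-and-stretch adjustment mentioned above can be sketched as follows. This is a minimal illustration under one common parameterisation (shift the mean by `dz`, rescale the width about the mean by `stretch`); the Gaussian toy n(z), the function names and the parameter values are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def integrate(y, x):
    """Trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def shift_and_stretch(z, nz, dz=0.0, stretch=1.0):
    """Shift the mean of n(z) by dz and rescale its width by `stretch`."""
    z_mean = integrate(z * nz, z) / integrate(nz, z)
    # Map each output redshift back to the fiducial curve.
    z_src = (z - z_mean - dz) / stretch + z_mean
    new = np.interp(z_src, z, nz, left=0.0, right=0.0) / stretch
    return new / integrate(new, z)  # renormalise to unit area

# Toy fiducial n(z): a Gaussian with mean 0.8 and width 0.15 (hypothetical).
z = np.linspace(0.0, 3.0, 601)
nz = np.exp(-0.5 * ((z - 0.8) / 0.15) ** 2)
nz /= integrate(nz, z)

shifted = shift_and_stretch(z, nz, dz=0.05, stretch=1.2)
mean_new = integrate(z * shifted, z)
std_new = np.sqrt(integrate((z - mean_new) ** 2 * shifted, z))
```

With only two numbers per tomographic bin, this model can move and widen the distribution but cannot change its shape, which is the limitation the more flexible GP and PCA descriptions are meant to address.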

Core claim

The paper studies four n(z) uncertainty models (shifts, shifts and stretches, Gaussian processes, and principal component analysis), calibrated on ensembles of simulated n(z) that include stochastic and systematic variations. Stage-IV 3x2-pt analyses must go beyond simple shift and stretch models, and PCA is advocated even for early Stage-IV surveys. A five-parameter PCA model degrades the constraint on S8 by 5 percent relative to the shift-and-stretch case while incurring half the bias in Omega_m and sigma_8. All models can be safely marginalised analytically, with speed-ups of up to a factor of 25 depending on model dimensionality.

What carries the argument

Principal component analysis (PCA) applied to ensembles of simulated n(z) distributions, together with gradient-based inference and approximate analytical marginalisation.
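A minimal sketch of a PCA description built from an ensemble of n(z) realisations is given below. The toy ensemble (Gaussians with jittered mean and width) is an assumption standing in for the paper's simulated ensembles, and the variable names are hypothetical; only the choice of five components follows the text.

```python
import numpy as np

rng = np.random.default_rng(0)
z = np.linspace(0.0, 2.0, 200)

# Toy ensemble (assumption): 500 Gaussians with perturbed mean and width.
means = 0.8 + 0.03 * rng.standard_normal(500)
widths = 0.15 * (1.0 + 0.1 * rng.standard_normal(500))
ensemble = np.exp(-0.5 * ((z[None, :] - means[:, None]) / widths[:, None]) ** 2)
ensemble /= ensemble.sum(axis=1, keepdims=True) * (z[1] - z[0])

# PCA via SVD of the mean-subtracted ensemble; rows of vt are the modes.
mean_nz = ensemble.mean(axis=0)
resid = ensemble - mean_nz
_, sing, vt = np.linalg.svd(resid, full_matrices=False)

n_comp = 5
components = vt[:n_comp]                 # shape (5, len(z)), orthonormal rows
coeffs = resid @ components.T            # shape (500, 5), one row per realisation
explained = (sing[:n_comp] ** 2).sum() / (sing ** 2).sum()

# Draw a new n(z) from a Gaussian fit to the coefficient distribution.
coeff_cov = np.cov(coeffs, rowvar=False)
alpha = rng.multivariate_normal(np.zeros(n_comp), coeff_cov)
nz_sample = mean_nz + alpha @ components
```

In an inference pipeline the coefficients `alpha` would play the role of the nuisance parameters, with `coeff_cov` acting as their Gaussian prior.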

Load-bearing premise

The ensembles of simulated n(z) that include stochastic and systematic variations are representative of the true uncertainties present in real Stage-IV survey data.

What would settle it

A direct comparison of the bias levels in Omega_m and sigma_8 obtained from real survey data when using the PCA model versus the shift-and-stretch model, checked against independent cosmological probes.

Figures

Figures reproduced from arXiv: 2604.24425 by Alex Malz, Benjamin Joachimi, Benjamin Stölzner, Carlos García-García, Ian Harrison, Jaime Ruiz-Zapatero, Joe Zuntz, Qianjun Hang, The LSST Dark Energy Science Collaboration, Yun-Hao Zhang.

**Figure 1.** The ensembles of galaxy redshift distributions from the CosmoDC2 catalogue produced by Zhang et al. (2026) for each tomographic bin for the lens (top row) and source samples (bottom row). Each ensemble contains the statistical uncertainties of the CosmoDC2 catalogue as given by the FlexZBoost algorithm as well as systematic uncertainties due to the incompleteness of the reference sample. spectra between th…

**Figure 2.** A comparison between the measured galaxy redshift distribution, n(z), for the first tomographic bin of the lens sample by Zhang et al. (2026) and the n(z) samples generated by different uncertainty models (shifts, shifts & stretches, PCAs and GPs) after being calibrated on the former. The upper panels show overlapping samples of the processes. The lower panels show the correlation matrices of the processes…

**Figure 3.** The posteriors for the galaxy distribution of each tomographic bin given by different uncertainty models. Upper panels show the lens sample bins. Lower panels show the source sample bins. The first row of panels in each block shows a direct comparison between the galaxy distributions obtained for each model. In the rows below we show the standard deviation on the n(z) posterior obtained by every model cons…

**Figure 5.** The 1 & 2D marginalised constraints for the cosmological parameters S8 and ⟨b_g⟩, the average galaxy bias across all tomographic bins, accounting for galaxy distribution uncertainties using different models. Black dashed contours correspond to when no model was considered (i.e. galaxy distribution uncertainties were not considered in the analysis). Blue contours correspond to the shifts (Δz) model posteri…

**Figure 6.** The standard deviation of the galaxy bias parameter of each tomographic bin in the lens sample. The labels on the horizontal axis correspond to the mean redshift of each tomographic bin. The different error bars represent the constraints from the different galaxy distribution uncertainty models considered in this work. Black lines correspond to when no model was considered (i.e. galaxy distribution uncer…

**Figure 7.** Reduced χ² distributions obtained when assuming a given galaxy distribution uncertainty model to fit noiseless data vectors based on the samples in the n(z) calibration ensemble for shared cosmology. Thus the χ² distributions shown represent the error incurred by assuming a given model. Models with a higher error will lead to a higher bias in the cosmology parameters. The black dashed line corresponds …

**Figure 8.** Square root bias over standard deviation as a percentage induced on the cosmological parameters Ωm (bottom panel) and σ8 (top panel) purely due to the choice of n(z) uncertainty model. The bias measurement was obtained using a linear approximation to fit theory vectors generated using n(z) samples from the calibration ensemble using theory vectors generated based on the n(z) given by each model. The standa…

**Figure 9.** The 1 & 2D marginalised constraints for the cosmological parameters Ωm, σ8 and S8, accounting for n(z) uncertainties using different models and different marginalisation techniques. In each sub-panel three different contours are compared. First, a set of black dashed contours show the constraints obtained when the parameters of the associated model were kept fixed. Second, the two sets of coloured contour…

**Figure 10.** A comparison between the 1D marginal S8 distributions obtained when considering different galaxy redshift distribution uncertainty models with respect to the case where no model is considered. We show constraints for a shift model (blue), a shift & stretch model (yellow), a PCA model (green) and a GP model (orange). Full opacity bars correspond to numerical constraints while half opacity bars correspond…
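The Figure 8 caption refers to a bias measurement obtained with a linear approximation. One standard way to write such an estimate is the Fisher-style formula Δθ = (Jᵀ C⁻¹ J)⁻¹ Jᵀ C⁻¹ Δt, where J is the Jacobian of the theory vector with respect to the cosmological parameters, C the data covariance and Δt the mismatch between theory vectors. The sketch below assumes this form; whether it matches the paper's exact procedure is not stated in the caption, and all arrays are random placeholders.

```python
import numpy as np

def linear_bias(jac, cov, delta_t):
    """Parameter shift and 1-sigma errors under a linearised Gaussian likelihood."""
    cinv_j = np.linalg.solve(cov, jac)           # C^{-1} J
    fisher = jac.T @ cinv_j                      # J^T C^{-1} J
    shift = np.linalg.solve(fisher, cinv_j.T @ delta_t)
    sigma = np.sqrt(np.diag(np.linalg.inv(fisher)))
    return shift, sigma

rng = np.random.default_rng(2)
jac = rng.standard_normal((40, 2))               # placeholder Jacobian, 2 parameters
cov = np.diag(rng.uniform(0.5, 2.0, 40))         # placeholder diagonal covariance

# Build a theory-vector mismatch that corresponds to a known parameter shift.
true_shift = np.array([0.02, -0.01])
delta_t = jac @ true_shift

shift, sigma = linear_bias(jac, cov, delta_t)
bias_percent = 100.0 * np.abs(shift) / sigma     # bias as a percentage of sigma
```

Because the mismatch here is constructed to lie exactly in the span of the Jacobian, the recovered `shift` reproduces `true_shift`; for a real n(z) mismatch only the projected part would be recovered.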

Original abstract

Uncertainties in the radial distribution of galaxies, $\boldsymbol{n}(\boldsymbol{z})$, are one of the major contributions to the error budget of early Stage-IV galaxy survey analyses of weak gravitational lensing, galaxy clustering and galaxy-galaxy lensing (3$\times$2-pt). Based on ensembles of simulated $\boldsymbol{n}(\boldsymbol{z})$ including stochastic and systematic variations, we study the impact of four different $\boldsymbol{n}(\boldsymbol{z})$ uncertainty models: shifts, shifts & stretches, Gaussian processes (GP) and principal component analysis (PCA). Due to the high dimensionality of the latter models, we make use of state-of-the-art gradient-based inference methods as well as approximate analytical marginalisation schemes. Our results show that Stage-IV 3$\times$2-pt analyses must go beyond simple shift & stretch models. In particular, we advocate for the adoption of PCA models even in early Stage-IV surveys. Our results show that considering a five-parameter PCA model only degrades the constraint on the $S_{\rm 8}$ parameter by $5$ per cent with respect to the case when only a shift and a stretch parameter are included, while incurring half the bias in its constituent parameters, $\Omega_{\rm m}$ and $\sigma_{\rm 8}$. We demonstrate that all models considered can be safely marginalised analytically, with speed-ups of up to a factor of 25 depending on the dimensionality of the model. This will allow Stage-IV analyses to safely include higher-dimensional $\boldsymbol{n}(\boldsymbol{z})$ uncertainty models in their analysis at negligible additional computational cost.
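As context for the $S_{\rm 8}$ figures quoted above, the conventional weak-lensing definition (a standard convention, assumed here to be the one the paper uses) and its first-order error propagation are:

$$
S_{\rm 8} \equiv \sigma_{\rm 8}\,\sqrt{\frac{\Omega_{\rm m}}{0.3}},
\qquad
\frac{\delta S_{\rm 8}}{S_{\rm 8}} \simeq \frac{\delta\sigma_{\rm 8}}{\sigma_{\rm 8}} + \frac{1}{2}\,\frac{\delta\Omega_{\rm m}}{\Omega_{\rm m}} .
$$

Under this convention, biases in $\Omega_{\rm m}$ and $\sigma_{\rm 8}$ of opposite sign can partially cancel in $S_{\rm 8}$, which is one reason to report the constituent parameters separately.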

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This paper gives concrete numbers on the cost of moving from shift-and-stretch to PCA models for n(z) uncertainties in 3x2pt work, plus evidence that analytical marginalization keeps the computation cheap.


The main thing to know is that a 5-parameter PCA model for redshift distribution uncertainties degrades the S8 constraint by only 5% relative to a simple shift-and-stretch model while halving the bias on Omega_m and sigma_8, and all the models can be analytically marginalized with speed-ups up to a factor of 25. They reach this by running ensembles of simulated n(z) that include both stochastic and systematic variations, then comparing four uncertainty models with gradient-based inference for the higher-dimensional cases and testing analytical marginalization schemes across them.

The quantitative trade-off numbers and the demonstration that analytical methods work at scale are the useful parts; they turn a general warning about photo-z errors into something that can be plugged into an error budget or pipeline design. The simulations are forward-modeled and the marginalization tests are direct, so there is no obvious circularity in the results.

The soft spot is the representativeness of those simulated ensembles. Everything rests on how well the added stochastic and systematic variations match the actual correlation structure and unmodeled effects in real Stage-IV photometric data. If the real uncertainties have higher effective dimensionality or different systematics, the claimed advantage of PCA over simpler models could shrink. The abstract does not show direct validation against existing survey data, which leaves that as the main open question.

This is aimed at people building or reviewing 3x2pt analyses for early Stage-IV surveys who need to choose an n(z) uncertainty model and know the computational price. A reader setting up an LSST or Euclid pipeline would get immediate value from the numbers and the marginalization tricks. It deserves a serious referee because the quantitative results and the practical computational finding are directly relevant to current planning, even if additional real-data checks would strengthen the case. I would send it to review.

Referee Report

1 major / 2 minor

Summary. The paper studies the impact of galaxy redshift distribution n(z) uncertainties on 3×2-pt cosmological analyses for Stage-IV surveys. Using ensembles of simulated n(z) that incorporate stochastic and systematic variations, it compares four uncertainty models (shifts, shifts+stretches, Gaussian processes, and PCA) and employs gradient-based inference plus analytical marginalization for the higher-dimensional cases. The central result is that simple shift & stretch models are insufficient; a 5-parameter PCA model degrades the S8 constraint by only 5% relative to shift+stretch while halving the bias on Ωm and σ8, and all models can be analytically marginalized with speed-ups up to a factor of 25.

Significance. If the simulated ensembles are representative, the work supplies actionable guidance for n(z) modeling in upcoming surveys by quantifying the bias–precision trade-off and demonstrating that higher-dimensional models remain computationally tractable via analytical marginalization. The explicit use of gradient-based sampling and closed-form marginalization schemes for PCA and GP models is a clear methodological strength that enables the dimensionality study.
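To illustrate why closed-form marginalisation can be cheap, the sketch below assumes a first-order expansion of the theory vector in the nuisance parameters, t(α) ≈ t(α₀) + T (α − α₀), with a Gaussian prior of covariance P on α. Under these assumptions the nuisance parameters integrate out exactly and simply inflate the data covariance to C + T P Tᵀ. This is one common scheme and may differ from the approximation used in the paper; all matrices are random placeholders.

```python
import numpy as np

def marginalised_covariance(cov, jac, prior_cov):
    """Data covariance after integrating out linear Gaussian nuisance parameters."""
    return cov + jac @ prior_cov @ jac.T

rng = np.random.default_rng(1)
n_data, n_nuis = 30, 5
cov = np.diag(rng.uniform(0.5, 1.5, n_data))     # placeholder data covariance C
jac = rng.standard_normal((n_data, n_nuis))      # placeholder T = dt/dalpha
prior_cov = np.diag(np.full(n_nuis, 0.1))        # placeholder prior covariance P

cov_marg = marginalised_covariance(cov, jac, prior_cov)

# Monte Carlo cross-check: scatter of mock data when alpha is drawn from its prior.
n_mock = 100_000
alphas = rng.multivariate_normal(np.zeros(n_nuis), prior_cov, size=n_mock)
noise = rng.multivariate_normal(np.zeros(n_data), cov, size=n_mock)
mock = alphas @ jac.T + noise
empirical_cov = np.cov(mock, rowvar=False)
max_diff = float(np.max(np.abs(empirical_cov - cov_marg)))
```

Once the covariance is inflated in this way, the sampler only needs to explore the cosmological parameters, so the cost no longer grows with the number of n(z) parameters.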

major comments (1)

[§3 and abstract] §3 (Simulation of n(z) ensembles) and abstract: The quantitative recommendation that Stage-IV analyses 'must go beyond simple shift & stretch models' and should adopt 5-parameter PCA even in early surveys rests entirely on performance metrics derived from the simulated ensembles. The reported 5% S8 degradation and factor-of-two bias reduction on Ωm/σ8 are specific to the included stochastic and systematic variations; without additional tests (e.g., sensitivity to different correlation lengths, comparison against real DES/HSC n(z) residuals, or injection of unmodeled systematics), it is unclear whether the relative advantage of PCA persists for actual Stage-IV photometric-redshift uncertainties.

minor comments (2)

[§2] Notation: the boldface vector notation for n(z) is introduced in the abstract but should be defined once in §2 to avoid any ambiguity between the distribution and its binned representation.
[Results figures] Figure clarity: the panels comparing bias and variance across models would benefit from explicit error bars on the reported 5% and factor-of-two figures so readers can judge statistical significance of the differences.

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive review and for highlighting the importance of assessing the robustness of our simulated ensembles. We have revised the manuscript to incorporate additional sensitivity tests, clarify the scope of our simulations, and moderate the language of our recommendations in the abstract and §3. These changes preserve the core results while addressing the generality concerns.


Referee: [§3 and abstract] §3 (Simulation of n(z) ensembles) and abstract: The quantitative recommendation that Stage-IV analyses 'must go beyond simple shift & stretch models' and should adopt 5-parameter PCA even in early surveys rests entirely on performance metrics derived from the simulated ensembles. The reported 5% S8 degradation and factor-of-two bias reduction on Ωm/σ8 are specific to the included stochastic and systematic variations; without additional tests (e.g., sensitivity to different correlation lengths, comparison against real DES/HSC n(z) residuals, or injection of unmodeled systematics), it is unclear whether the relative advantage of PCA persists for actual Stage-IV photometric-redshift uncertainties.

Authors: We agree that the quantitative metrics (5% S8 degradation and halved bias on Ωm/σ8) are specific to the simulated n(z) ensembles in §3, which incorporate stochastic variations with correlation lengths motivated by typical photo-z uncertainties and systematic shifts drawn from known photometric redshift biases. To test robustness, we have added new sensitivity analyses in §4.2 and Appendix C in which the stochastic correlation length is varied by factors of 0.5 and 2.0; the PCA model continues to limit S8 degradation to ≤7% while reducing Ωm/σ8 bias by at least 40% relative to shift+stretch. We have also expanded the discussion in §5 to compare the amplitude and structure of our simulated variations against published n(z) uncertainty estimates from DES and HSC, noting that our ensembles bracket the reported residual levels. Direct injection of additional unmodeled systematics beyond those already included (mean/width shifts plus higher-order modes) would require new ensemble generation; however, the PCA basis already captures a broad range of such modes by construction. We have revised the abstract and the final paragraph of §3 to replace prescriptive phrasing (“must go beyond”) with “our simulations indicate that analyses should consider models beyond simple shift and stretch,” thereby qualifying the recommendation to the tested ensemble class.

Revision: yes
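The link between correlation length and the number of PCA modes needed can be illustrated with a toy calculation. The sketch assumes the stochastic n(z) perturbations follow a squared-exponential covariance in redshift; the kernel choice, the base correlation length of 0.2 and the 99% variance threshold are all assumptions, and only the scaling factors 0.5 and 2.0 echo the rebuttal.

```python
import numpy as np

def n_components_needed(corr_length, z, frac=0.99):
    """Number of leading eigenmodes holding `frac` of the kernel's variance."""
    dz = z[:, None] - z[None, :]
    kernel = np.exp(-0.5 * (dz / corr_length) ** 2)
    eigvals = np.linalg.eigvalsh(kernel)[::-1]   # sort descending
    eigvals = np.clip(eigvals, 0.0, None)        # remove tiny negative round-off
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    return int(np.searchsorted(cumulative, frac) + 1)

z = np.linspace(0.0, 2.0, 200)
base_length = 0.2                                # hypothetical base correlation length
needed = {f: n_components_needed(f * base_length, z) for f in (0.5, 1.0, 2.0)}
```

In this toy setting shorter correlation lengths spread the variance over more modes, which is the sense in which a fixed five-component basis could become less adequate if real uncertainties have a higher effective dimensionality.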

Circularity Check

0 steps flagged

No significant circularity; results from direct simulation ensembles and marginalization tests


The paper derives its central claims (need to go beyond shift/stretch models, advocacy for 5-param PCA in early Stage-IV) from forward simulations of n(z) ensembles that include stochastic/systematic variations, followed by explicit marginalization tests across four uncertainty models using gradient-based inference and analytical approximations. No load-bearing step reduces by construction to a fitted parameter renamed as prediction, no self-citation chain justifies uniqueness or ansatz, and no derivation equates to its inputs. The quantified trade-offs (5% S8 degradation, halved bias on Ωm/σ8) are direct outputs of the simulation pipeline rather than tautological. The representativeness assumption is a standard modeling choice, not a circular reduction.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The study rests on the assumption that simulated n(z) ensembles capture real survey uncertainties; no new physical entities are introduced.

free parameters (1)

PCA model dimensionality
Five components chosen to represent higher-dimensional uncertainty while remaining computationally tractable.

axioms (1)

Domain assumption: Simulated n(z) ensembles with stochastic and systematic variations are representative of real data uncertainties.
This assumption underpins all model comparisons and bias estimates.


