Filtering Interlopers with Photometry and Diagnostic Features: A Machine Learning Framework Validated with CSST Slitless Spectroscopy

Cheng Li; Gongbo Zhao; Hong Guo; Hui Peng; Hu Zhan; Hu Zou; Jipeng Sui; Pengjie Zhang; Run Wen; Xian Zhong Zheng

arxiv: 2601.03883 · v2 · pith:XXOIFSNTnew · submitted 2026-01-07 · 🌌 astro-ph.CO

Hui Peng, Yu Yu, Yiyang Guo, Yizhou Gu, Run Wen, Yunkun Han, Jipeng Sui, Hu Zou, Xiaohu Yang, Pengjie Zhang, Xian Zhong Zheng, Hong Guo, Yipeng Jing, Cheng Li, Hu Zhan, Gongbo Zhao

Pith reviewed 2026-05-21 16:08 UTC · model grok-4.3

classification 🌌 astro-ph.CO

keywords slitless spectroscopy · interloper galaxies · XGBoost · redshift estimation · machine learning · CSST survey · photometric diagnostics · spectroscopic features

The pith

An XGBoost classifier using photometry and spectroscopic diagnostics filters interlopers in simulated CSST slitless spectroscopy, retaining a sample in which 96.6 percent of galaxies have accurate redshifts and only 0.13 percent are outliers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses emission-line misidentification in slitless spectroscopic surveys such as CSST, which introduces interloper galaxies and contaminates redshift samples. Conventional strict selection cuts struggle to secure high purity and sharply reduce completeness by discarding valuable data. The authors train an XGBoost classifier on photometric properties combined with spectroscopic diagnostic features to select a cleaner subsample. On simulated CSST data with a parent sample of about 62 million galaxies, the classifier keeps roughly 42 percent of the parent sample, and 96.6 percent of the selected galaxies have accurate redshifts, defined as |Δz| ≤ 0.002(1+z). The outlier fraction, defined as |Δz| > 0.01(1+z), is just 0.13 percent, a clear improvement over configurations that drop either photometry or diagnostics.

Core claim

The central claim is that an XGBoost classifier trained on photometric properties and spectroscopic diagnostic features can construct a high-purity redshift catalog from slitless spectroscopy. Validated on a simulated sample generated by the CSST emulator, the classifier selects galaxies with 42.3 percent efficiency on the test set. Among the retained galaxies, 96.6 percent achieve accurate measurements with |Δz| ≤ 0.002(1+z), while the outlier fraction with |Δz| > 0.01(1+z) is held to 0.13 percent. Models that omit spectroscopic diagnostics raise the outlier fraction by a factor of about 3.5, and models that omit photometry raise it by a factor of about 6.3 while also introducing notable catastrophic interloper contamination.
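The two thresholds in the core claim can be made concrete with a short sketch. It is an illustration only: the function name `redshift_quality` and the toy redshifts are assumptions, and normalizing by the true redshift in (1+z) is a choice made here, since the text above does not say which redshift enters the normalization.

```python
def redshift_quality(z_measured, z_true, accurate_tol=0.002, outlier_tol=0.01):
    """Return (accurate_fraction, outlier_fraction) for paired redshifts.

    accurate: |dz| <= accurate_tol * (1 + z)
    outlier:  |dz| >  outlier_tol  * (1 + z)
    Assumption: z in the normalization is the true redshift.
    """
    if len(z_measured) != len(z_true) or len(z_true) == 0:
        raise ValueError("need two equal-length, non-empty sequences")
    n_accurate = 0
    n_outlier = 0
    for z_m, z_t in zip(z_measured, z_true):
        scaled_error = abs(z_m - z_t) / (1.0 + z_t)
        if scaled_error <= accurate_tol:
            n_accurate += 1
        elif scaled_error > outlier_tol:
            n_outlier += 1
        # anything in between is neither accurate nor an outlier
    n_total = len(z_true)
    return n_accurate / n_total, n_outlier / n_total


# Toy data (hypothetical): two accurate, one intermediate, one outlier.
toy_true = [0.5, 1.0, 1.2, 0.8]
toy_measured = [0.501, 1.0, 0.9, 0.812]
accurate_fraction, outlier_fraction = redshift_quality(toy_measured, toy_true)
```

Note that the two categories do not exhaust the sample: a measurement with a scaled error between 0.002 and 0.01 counts as neither accurate nor an outlier, which is why the quoted 96.6 percent and 0.13 percent do not sum to 100 percent.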

What carries the argument

XGBoost classifier that combines photometric properties with spectroscopic diagnostic features to identify galaxies likely to have correct redshift measurements
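How a classifier score turns into a catalog selection can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `select_and_score`, the toy scores and labels, and the thresholds are all assumptions, and the paper's actual decision threshold is not given in the text above.

```python
def select_and_score(scores, is_accurate, threshold=0.5):
    """Keep sources whose classifier score is at least `threshold`.

    scores:      predicted probability that the redshift is accurate
    is_accurate: ground-truth flags (known only in simulation)
    Returns (selection_efficiency, purity).
    """
    if len(scores) != len(is_accurate) or len(scores) == 0:
        raise ValueError("need two equal-length, non-empty sequences")
    kept_flags = [ok for s, ok in zip(scores, is_accurate) if s >= threshold]
    efficiency = len(kept_flags) / len(scores)
    # Purity is undefined when nothing is selected.
    purity = sum(kept_flags) / len(kept_flags) if kept_flags else float("nan")
    return efficiency, purity


# Toy scores and labels (hypothetical values for illustration).
toy_scores = [0.95, 0.90, 0.20, 0.70, 0.10, 0.40, 0.85, 0.05]
toy_labels = [True, True, False, False, False, True, True, False]

loose = select_and_score(toy_scores, toy_labels, threshold=0.5)
strict = select_and_score(toy_scores, toy_labels, threshold=0.8)
```

In this toy case, raising the threshold from 0.5 to 0.8 lowers the efficiency and raises the purity, which is the trade-off the classifier is meant to soften.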

If this is right

Among retained galaxies, 96.6 percent achieve accurate measurements with |Δz| ≤ 0.002(1+z).
The outlier fraction with |Δz| > 0.01(1+z) is constrained to 0.13 percent.
The classifier maintains a selection efficiency of 42.2 percent when deployed on the full parent sample of galaxies with valid redshifts.
Excluding spectroscopic diagnostics raises the outlier fraction by a factor of roughly 3.5.
Excluding photometric data raises the outlier fraction by a factor of roughly 6.3 and introduces notable catastrophic interloper contamination.
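The percentages above translate into rough absolute numbers. The sketch below is back-of-envelope arithmetic under stated assumptions: the parent sample is taken as exactly 62 million (the abstract says about 62 million), the full-sample outlier fraction of 0.11 percent is taken from the abstract, and the ablation factors are applied to the 0.13 percent test-set figure, which is an assumption about the baseline they refer to.

```python
# Inputs quoted in the text; PARENT_SAMPLE is approximate.
PARENT_SAMPLE = 62_000_000
EFFICIENCY_FULL = 0.422          # selection efficiency on the parent sample
OUTLIER_FULL = 0.0011            # 0.11% outliers on the parent sample
OUTLIER_TEST = 0.0013            # 0.13% outliers on the test set
ABLATION_FACTORS = {
    "no spectroscopic diagnostics": 3.5,
    "no photometry": 6.3,
}

# Galaxies retained by the classifier and the outliers left among them.
retained = PARENT_SAMPLE * EFFICIENCY_FULL
expected_outliers = retained * OUTLIER_FULL

# Outlier fractions implied for the simplified configurations
# (assumes the factors multiply the 0.13% test-set value).
implied_outlier_fraction = {
    name: OUTLIER_TEST * factor for name, factor in ABLATION_FACTORS.items()
}

print(f"retained galaxies: {retained:,.0f}")
print(f"expected outliers: {expected_outliers:,.0f}")
for name, fraction in implied_outlier_fraction.items():
    print(f"{name}: {100 * fraction:.2f}% outliers")
```

Under these assumptions the selected catalog holds roughly 26 million galaxies with on the order of 29 thousand outliers, and the simplified configurations would sit near 0.46 percent and 0.82 percent outliers.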

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same combination of photometry and diagnostics could be adapted to improve interloper rejection in other slitless surveys such as Euclid.
Further gains in purity might be obtained by testing ensemble methods or adding higher-order spectral summary statistics not used in the current classifier.
The results underscore the benefit of multi-modal feature fusion for controlling systematic errors in large cosmological datasets.

Load-bearing premise

The simulated spectra generated by the CSST emulator accurately reproduce the noise properties, line misidentification rates, and photometric-spectroscopic correlations that will occur in actual CSST observations.

What would settle it

Applying the trained classifier to real CSST slitless spectroscopy observations and then measuring the fraction of galaxies with |Δz| ≤ 0.002(1+z) and the fraction with |Δz| > 0.01(1+z) via cross-matches to independent high-resolution redshift surveys would test whether the reported accuracy and outlier rates hold.

Original abstract

The slitless spectroscopic method employed by missions such as Euclid and the Chinese Space-station Survey Telescope (CSST) faces a fundamental challenge: spectroscopic redshifts derived from their data are susceptible to emission-line misidentification due to the limited spectral resolution and signal-to-noise ratio. This effect systematically introduces interloper galaxies into the sample. Conventional strict selection not only struggles to secure high redshift purity but also drastically reduces completeness by discarding valuable data. To overcome this limitation, we develop an XGBoost classifier that leverages photometric properties and spectroscopic diagnostics to construct a high-purity redshift catalog while maximizing completeness. We validate this method on a simulated sample with spectra generated by the CSST emulator for slitless spectroscopy. Of the $\sim$62 million galaxies that obtain valid redshifts (parent sample), approximately 43% achieve accurate measurements, defined as $|\Delta z| \leqslant 0.002(1+z)$. From this parent sample, the XGBoost classifier selects galaxies with a selection efficiency of 42.3% on the test set and 42.2% when deployed on the entire parent sample. Crucially, among the retained galaxies, 96.6% (parent sample: 96.5%) achieve accurate measurements, while the outlier fraction ($|\Delta z|>0.01(1+z)$) is constrained to 0.13% (0.11%). We verified that simplified configurations that exclude either spectroscopic diagnostics (except the measured redshift) or photometric data yield significantly higher outlier fractions, increasing by factors of approximately 3.5 and 6.3, respectively, with the latter case also introducing notable catastrophic interloper contamination. This framework effectively resolves the purity-completeness trade-off, enabling robust large-scale cosmological studies with CSST and similar surveys.
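The abstract's numbers also imply how many of the accurate-redshift galaxies survive the selection, a quantity it does not quote directly. The sketch below derives it; the function name `accurate_recall` is an assumption, and because the 43 percent base rate is stated only approximately, the result is approximate as well.

```python
def accurate_recall(selection_efficiency, purity, base_rate):
    """Fraction of accurate-redshift galaxies that the selection retains.

    selection_efficiency: selected / parent
    purity:               accurate among selected
    base_rate:            accurate among parent
    recall = (selected and accurate) / (accurate in parent)
    """
    if not 0.0 < base_rate <= 1.0:
        raise ValueError("base_rate must lie in (0, 1]")
    return selection_efficiency * purity / base_rate


# Full parent sample figures from the abstract: 42.2% selected,
# 96.5% of those accurate, about 43% of the parent accurate.
recall = accurate_recall(0.422, 0.965, 0.43)
print(f"accurate galaxies retained: {100 * recall:.1f}%")
```

Read this way, the classifier raises the accurate fraction from about 43 percent to 96.5 percent while keeping roughly 95 percent of the galaxies whose redshifts were accurate to begin with.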

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

XGBoost on CSST emulator spectra keeps 42% of galaxies while hitting 96.6% accurate redshifts and 0.13% outliers, with ablations showing clear gains from photometry plus diagnostics, but everything rests on untested simulation fidelity.

The main point is that this XGBoost setup, trained on photometric properties and spectroscopic diagnostics from the CSST emulator, selects a subset where 96.6% of retained galaxies have accurate redshifts and outliers fall to 0.13%, all while retaining roughly 42% of the 62 million galaxy parent sample. The numbers hold on both the test set and the full simulated parent sample.

Referee Report

2 major / 2 minor

Summary. The paper develops an XGBoost classifier that combines photometric properties and spectroscopic diagnostics to filter emission-line interlopers in CSST slitless spectroscopy. On a simulated parent sample of ~62 million galaxies, it reports a selection efficiency of 42.3% on the test set (42.2% on the full parent sample). Among the retained galaxies, 96.6% (96.5%) have accurate redshifts (|Δz| ≤ 0.002(1+z)) and outliers (|Δz| > 0.01(1+z)) are limited to 0.13% (0.11%). Ablation tests show that removing diagnostics or photometry increases the outlier fraction by factors of ~3.5 and ~6.3, respectively.

Significance. If the CSST emulator faithfully reproduces real noise, line misidentification rates, and photo-spectroscopic correlations, the framework offers a practical route to higher-purity redshift catalogs without the severe completeness loss of traditional cuts. The large simulated sample size, held-out test metrics, and explicit ablation baselines against photometry-only and diagnostics-only variants constitute clear strengths and make the performance gains falsifiable within the simulation framework.

major comments (2)

[Abstract and results section] The quoted purity (96.6% accurate, 0.13% outliers) and the factor-of-3.5/6.3 degradation in ablations are obtained exclusively inside the CSST emulator. The manuscript does not provide a quantitative assessment of how well the emulator reproduces the actual noise power spectrum, continuum subtraction residuals, or emission-line confusion rates expected in flight data; this assumption is load-bearing for the claim that the classifier will deliver comparable performance on real CSST observations.
[Methods/validation] The parent sample is defined as galaxies that obtain valid redshifts from the emulator; it is unclear how the fraction of galaxies that fail to yield any redshift in the first place (and are therefore excluded before the classifier is applied) compares to real CSST data, which directly affects the effective completeness of the final catalog.

minor comments (2)

[Figures and text] Figure captions and text should explicitly state the exact definitions of “accurate” and “outlier” (including the precise (1+z) normalization) each time the numbers 96.6% and 0.13% are quoted.
[Methods] The XGBoost hyper-parameter values and training/validation split ratios are mentioned but not tabulated; a short supplementary table would improve reproducibility.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive comments. We address each major point below, clarifying the simulation-based nature of the study and adding explicit caveats where appropriate. Revisions will be made to the abstract, methods, and discussion sections.

Referee: [Abstract and results section] The quoted purity (96.6% accurate, 0.13% outliers) and the factor-of-3.5/6.3 degradation in ablations are obtained exclusively inside the CSST emulator. The manuscript does not provide a quantitative assessment of how well the emulator reproduces the actual noise power spectrum, continuum subtraction residuals, or emission-line confusion rates expected in flight data; this assumption is load-bearing for the claim that the classifier will deliver comparable performance on real CSST observations.

Authors: We agree that all reported metrics, including purity, outlier rates, and ablation results, are derived exclusively from the CSST emulator simulations. As CSST has not yet begun flight operations, no real data exist for direct quantitative validation of noise properties or line confusion rates. The emulator follows published CSST instrumental specifications and has been cross-checked against expected performance in prior CSST design studies. We will revise the abstract and add a dedicated paragraph in the discussion section to explicitly state the simulation-only scope, reference the emulator's documented assumptions, and note that real-data validation will be required once observations are available.
Revision: yes

Referee: [Methods/validation] The parent sample is defined as galaxies that obtain valid redshifts from the emulator; it is unclear how the fraction of galaxies that fail to yield any redshift in the first place (and are therefore excluded before the classifier is applied) compares to real CSST data, which directly affects the effective completeness of the final catalog.

Authors: The parent sample is defined as galaxies that receive a valid redshift from the emulator because the classifier operates on sources that already have a spectroscopic measurement; galaxies without any redshift are excluded prior to interloper filtering. This definition is standard for post-processing purity improvements. We will expand the methods section to clarify this boundary, explain that the reported selection efficiency is conditional on redshift success, and note that the overall catalog completeness is the product of the initial redshift success rate and our 42% selection efficiency. A direct numerical comparison of failure fractions to real CSST data cannot be performed at present.
Revision: partial
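The product described in this response can be written out as a small sketch. The success rate used below is a purely hypothetical placeholder, since neither the paper summary nor the rebuttal gives a value for it; only the 42.2% selection efficiency comes from the text.

```python
def overall_completeness(redshift_success_rate, selection_efficiency):
    """Fraction of all observed galaxies that reach the final catalog.

    Both inputs are fractions in [0, 1]; the selection efficiency is
    conditional on a redshift having been measured at all.
    """
    for rate in (redshift_success_rate, selection_efficiency):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must lie in [0, 1]")
    return redshift_success_rate * selection_efficiency


HYPOTHETICAL_SUCCESS_RATE = 0.60   # placeholder, NOT a value from the paper
SELECTION_EFFICIENCY = 0.422       # quoted for the full parent sample

completeness = overall_completeness(HYPOTHETICAL_SUCCESS_RATE, SELECTION_EFFICIENCY)
print(f"overall completeness: {100 * completeness:.1f}%")
```

The point of the sketch is only that any mismatch between the emulator's redshift failure rate and the real one scales the final completeness linearly, which is why the referee's question matters.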

standing simulated objections not resolved

Direct quantitative assessment of emulator fidelity to real CSST flight data (noise spectrum, continuum residuals, line confusion) and comparison of redshift failure rates, as no flight observations are yet available.

Circularity Check

0 steps flagged

No circularity: performance metrics computed on held-out test data from independent simulation

full rationale

The paper trains an XGBoost classifier on simulated CSST spectra and reports selection efficiency, accuracy, and outlier rates on a separate test set plus the full parent sample. These quantities are direct empirical measurements on data not used for training; no equation or parameter is fitted to the target purity metric and then re-labeled as a prediction. No self-citation is invoked as a uniqueness theorem or load-bearing premise, and the central result does not reduce by construction to its own inputs. The simulation-fidelity assumption is an external-validity concern, not a circularity in the derivation chain.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central performance claims rest on the fidelity of the CSST emulator simulation and on the choice of input features and model hyperparameters; these are not derived from first principles within the paper.

free parameters (1)

XGBoost hyperparameters
Learning rate, maximum depth, number of estimators and similar tuning choices are required to train the classifier but are not reported in the abstract.
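A tabulated form of these choices, as the referee report asks for, might look like the sketch below. The keys are parameter names from XGBoost's scikit-learn style interface; every value is a placeholder chosen for illustration and is not taken from the paper, which does not report them in the abstract.

```python
# Hypothetical hyperparameter record for reproducibility.
# All VALUES are placeholders, not the paper's settings.
xgb_hyperparameters = {
    "objective": "binary:logistic",  # binary task: accurate vs. not accurate
    "learning_rate": 0.1,            # placeholder
    "max_depth": 6,                  # placeholder
    "n_estimators": 500,             # placeholder
    "subsample": 0.8,                # placeholder
    "colsample_bytree": 0.8,         # placeholder
}

# Print as a simple two-column table.
for name, value in xgb_hyperparameters.items():
    print(f"{name:<18} {value}")
```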

axioms (1)

Domain assumption: The CSST emulator produces realistic slitless spectra that include the correct noise, resolution, and emission-line misidentification statistics.
All reported purity and completeness numbers are measured on data generated by this emulator.

pith-pipeline@v0.9.0 · 5916 in / 1361 out tokens · 74733 ms · 2026-05-21T16:08:15.762068+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "We develop an XGBoost classifier that leverages photometric properties and spectroscopic diagnostics... validated on a simulated sample with spectra generated by the CSST emulator"

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "selection efficiency of 42.3%... 96.6% achieve accurate measurements... outlier fraction constrained to 0.13%"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.


[1] [1]

Data Release 1 of the Dark Energy Spectroscopic Instrument

DESI Collaboration, M. Abdul-Karim, A.G. Adame, D. Aguado, J. Aguilar, S. Ahlen et al., Data Release 1 of the Dark Energy Spectroscopic Instrument,arXiv e-prints(2025) arXiv:2503.14745 [2503.14745]

work page internal anchor Pith review Pith/arXiv arXiv 2025

[2] [2]

Mellier, Abdurro’uf, J

Euclid Collaboration, Y. Mellier, Abdurro’uf, J.A. Acevedo Barroso, A. Ach´ ucarro, J. Adamek et al.,Euclid: I. Overview of the Euclid mission,Astron. and Astrophys.697(2025) A1 [2405.13491]

work page arXiv 2025

[3] [3]

Bock, A.M

J.J. Bock, A.M. Aboobaker, J. Adamo, R. Akeson, J.M. Alred, F. Alibay et al.,The SPHEREx Satellite Mission,arXiv e-prints(2025) arXiv:2511.02985 [2511.02985]

work page arXiv 2025

[4] [4]

Gebhardt , E

K. Gebhardt, E. Mentuch Cooper, R. Ciardullo, V. Acquaviva, R. Bender, W.P. Bowman et al.,The Hobby-Eberly Telescope Dark Energy Experiment (HETDEX) Survey Design, Reductions, and Detections,Astrophys. J.923(2021) 217 [2110.04298]

work page arXiv 2021

[5] [5]

Wide-Field InfrarRed Survey Telescope-Astrophysics Focused Telescope Assets WFIRST-AFTA 2015 Report

D. Spergel, N. Gehrels, C. Baltay, D. Bennett, J. Breckinridge, M. Donahue et al.,Wide-Field InfrarRed Survey Telescope-Astrophysics Focused Telescope Assets WFIRST-AFTA 2015 Report,arXiv e-prints(2015) arXiv:1503.03757 [1503.03757]

work page internal anchor Pith review Pith/arXiv arXiv 2015

[6] [6]

CSST Collaboration, Y. Gong, H. Miao, H. Zhan, Z.-Y. Li, J. Shangguan et al.,Introduction to the China Space Station Telescope (CSST),arXiv e-prints(2025) arXiv:2507.04618 [2507.04618]

work page arXiv 2025

[7] [7]

Wen, X.Z

R. Wen, X.Z. Zheng, Y. Han, X. Yang, X. Wang, H. Zou et al.,CSST large-scale structure analysis pipeline: II. The CSST Emulator for Slitless Spectroscopy,Mon. Not. Roy. Astron. Soc.528(2024) 2770 [2401.04171]

work page arXiv 2024

[8] [8]

Copin, M

Euclid Collaboration, Y. Copin, M. Fumana, C. Mancini, P.N. Appleton, R. Chary et al., Euclid Quick Data Release (Q1): From spectrograms to spectra: the SIR spectroscopic Processing Function,arXiv e-prints(2025) arXiv:2503.15307 [2503.15307]

work page arXiv 2025

[9] [9]

Le Brun, M

Euclid Collaboration, V. Le Brun, M. Bethermin, M. Moresco, D. Vibert, D. Vergani et al., Euclid Quick Data Release (Q1) – Characteristics and limitations of the spectroscopic measurements,arXiv e-prints(2025) arXiv:2503.15308 [2503.15308]

work page arXiv 2025

[10] [10]

Monaco, M.Y

Euclid Collaboration, P. Monaco, M.Y. Elkhashab, B.R. Granett, J. Salvalaggio, E. Sefusatti et al.,Euclid preparation. Controlling angular systematics in the Euclid spectroscopic galaxy sample,arXiv e-prints(2025) arXiv:2511.20856 [2511.20856]

work page arXiv 2025

[11] [11]

Interloper bias in future large-scale structure surveys

A.R. Pullen, C.M. Hirata, O. Dor´ e and A. Raccanelli,Interloper bias in future large-scale structure surveys,Publ. Astron. Soc. Jap.68(2016) 12 [1507.05092]. – 13 –

work page internal anchor Pith review Pith/arXiv arXiv 2016

[12] [12]

Bayesian Redshift Classification of Emission-line Galaxies with Photometric Equivalent Widths

A.S. Leung, V. Acquaviva, E. Gawiser, R. Ciardullo, E. Komatsu, A.I. Malz et al.,Bayesian Redshift Classification of Emission-line Galaxies with Photometric Equivalent Widths, Astrophys. J.843(2017) 130 [1510.07043]

work page internal anchor Pith review Pith/arXiv arXiv 2017

[13] [13]

The Impact of Line Misidentification on Cosmological Constraints from Euclid and other Spectroscopic Galaxy Surveys

G.E. Addison, C.L. Bennett, D. Jeong, E. Komatsu and J.L. Weiland,The Impact of Line Misidentification on Cosmological Constraints from Euclid and Other Spectroscopic Galaxy Surveys,Astrophys. J.879(2019) 15 [1811.10668]

work page internal anchor Pith review Pith/arXiv arXiv 2019

[14] H.S. Grasshorn Gebhardt, D. Jeong, H. Awan, J.S. Bridge, R. Ciardullo, D. Farrow et al., Unbiased Cosmological Parameter Estimation from Emission-line Surveys with Interlopers, Astrophys. J. 876 (2019) 32 [1811.06982]

[15] H. Awan and E. Gawiser, Angular Correlation Function Estimators Accounting for Contamination from Probabilistic Distance Measurements, Astrophys. J. 890 (2020) 78 [1911.07832]

[16] E. Massara, S. Ho, C.M. Hirata, J. DeRose, R.H. Wechsler and X. Fang, Line confusion in spectroscopic surveys and its possible effects: shifts in Baryon Acoustic Oscillations position, Mon. Not. Roy. Astron. Soc. 508 (2021) 4193 [2010.00047]

[17] M. Hilmi, N. Leethochawalit, M. Trenti and B. Metha, A novel analysis of contamination in Lyman-break galaxy samples at z ~ 6-8: spatial correlation with intermediate-redshift galaxies at z ~ 1.3-2, Mon. Not. Roy. Astron. Soc. 532 (2024) 920 [2501.00301]

[18] Euclid Collaboration, I. Risso, A. Veropalumbo, E. Branchini, E. Maragliano, S. de la Torre et al., Euclid preparation. The impact of redshift interlopers on the two-point correlation function analysis, arXiv e-prints (2025) arXiv:2505.04688 [2505.04688]

[19] J. Sui, H. Zou, X. Yang, X. Zheng, R. Wen, Y. Gu et al., CSST large scale structure analysis pipeline: III. Emission-line redshift measurement for slitless spectra, Mon. Not. Roy. Astron. Soc. 538 (2025) 395 [2502.11536]

[20] M.S. Cagliari, B.R. Granett, L. Guzzo, M. Bethermin, M. Bolzonella, S. de la Torre et al., Euclid: Testing photometric selection of emission-line galaxy targets, Astron. and Astrophys. 689 (2024) A166 [2403.08726]

[21] E.N. Kirby, P. Guhathakurta, S.M. Faber, D.C. Koo, B.J. Weiner and M.C. Cooper, The DEEP2 Galaxy Redshift Survey: Redshift Identification of Single-Line Emission Galaxies, Astrophys. J. 660 (2007) 62 [astro-ph/0701747]

[22] D. Davis, K. Gebhardt, E.M. Cooper, R. Ciardullo, M. Fabricius, D.J. Farrow et al., The HETDEX Survey Emission-line Exploration and Source Classification, Astrophys. J. 946 (2023) 86 [2301.01799]

[23] D.J. Farrow, A.G. Sánchez, R. Ciardullo, E.M. Cooper, D. Davis, M. Fabricius et al., Correcting correlation functions for redshift-dependent interloper contamination, Mon. Not. Roy. Astron. Soc. 507 (2021) 3187 [2104.04613]

[24] Y. Gong, H. Miao, P. Zhang and X. Chen, Self-calibrating Interloper Bias in Spectroscopic Galaxy-clustering Surveys, Astrophys. J. 919 (2021) 12 [2107.04745]

[25] S. Foroozan, E. Massara and W.J. Percival, Correcting for small-displacement interlopers in BAO analyses, JCAP 2022 (2022) 072 [2208.05001]

[26] H. Peng and Y. Yu, Precise self-calibration of interloper bias in spectroscopic surveys, Mon. Not. Roy. Astron. Soc. 526 (2023) 820 [2305.10487]

[27] A.B.H. Nguyen, E. Massara and W.J. Percival, Self-calibrating BAO measurements in the presence of small displacement interlopers, JCAP 2024 (2024) 008 [2311.14210]

[28] J.L. Bernal and A. Baleato Lizancos, Removal of interloper contamination to line-intensity maps using correlations with ancillary tracers of the large-scale structure, Phys. Rev. D 111 (2025) 043539 [2406.12979]

[29] E. Massara, F. Villaescusa-Navarro and W.J. Percival, Predicting interloper fraction with graph neural networks, JCAP 2023 (2023) 012 [2309.05850]

[30] M.S. Cagliari, A. Moradinezhad Dizgah and F. Villaescusa-Navarro, Correcting for Interloper Contamination in the Power Spectrum with Neural Networks, Astrophys. J. 991 (2025) 48 [2504.06919]

[31] H. Zhan, The wide-field multiband imaging and slitless spectroscopy survey to be carried out by the survey space telescope of china manned space program, Chinese Science Bulletin 66 (2021) 1290

[32] Y. Cao, Y. Gong, X.-M. Meng, C.K. Xu, X. Chen, Q. Guo et al., Testing photometric redshift measurements with filter definition of the Chinese Space Station Optical Survey (CSS-OS), Mon. Not. Roy. Astron. Soc. 480 (2018) 2178 [1706.09586]

[33] Y. Gong, H. Miao, X. Zhou, Q. Xiong, Y. Song, Y. Jiang et al., Future cosmology: New physics and opportunity from the China Space Station Telescope (CSST), Science China Physics, Mechanics, and Astronomy 68 (2025) 280402 [2501.15023]

[34] Y. Gu, X. Yang, J. Han, Y. Wang, Q. Li, Z. Tan et al., CSST large-scale structure analysis pipeline: I. Constructing reference mock galaxy redshift surveys, Mon. Not. Roy. Astron. Soc. 529 (2024) 4015 [2403.10754]

[35] J. Han, M. Li, W. Jiang, Z. Chen, H. Wang, C. Wei et al., The Jiutian simulations for the CSST extra-galactic surveys, Science China Physics, Mechanics, and Astronomy 68 (2025) 109511 [2503.21368]

[36] A. Dey, D.J. Schlegel, D. Lang, R. Blum, K. Burleigh, X. Fan et al., Overview of the DESI Legacy Imaging Surveys, Astron. J. 157 (2019) 168 [1804.08657]

[37] X. Yang, H. Xu, M. He, Y. Gu, A. Katsianis, J. Meng et al., An Extended Halo-based Group/Cluster Finder: Application to the DESI Legacy Imaging Surveys DR8, Astrophys. J. 909 (2021) 143 [2012.14998]

[38] X. Zhou, Y. Gong, X. Zhang, N. Li, X.-M. Meng, X. Chen et al., Accurately Estimating Redshifts from CSST Slitless Spectroscopic Survey Using Deep Learning, Astrophys. J. 977 (2024) 69 [2407.13991]

[39] T. Chen and C. Guestrin, XGBoost: A Scalable Tree Boosting System, arXiv e-prints (2016) arXiv:1603.02754 [1603.02754]

[40] R. Shwartz-Ziv and A. Armon, Tabular Data: Deep Learning is Not All You Need, Information Fusion 81 (2022) 84 [2106.03253]

[41] L. Grinsztajn, E. Oyallon and G. Varoquaux, Why do tree-based models still outperform deep learning on tabular data?, arXiv e-prints (2022) arXiv:2207.08815 [2207.08815]

[42] S. Lundberg and S.-I. Lee, A Unified Approach to Interpreting Model Predictions, arXiv e-prints (2017) arXiv:1705.07874 [1705.07874]

[43] E. Mentuch Cooper, K. Gebhardt, D. Davis, D.J. Farrow, C. Liu, G. Zeimann et al., HETDEX Public Source Catalog 1: 220 K Sources Including Over 50 K Lyα Emitters from an Untargeted Wide-area Spectroscopic Survey, Astrophys. J. 943 (2023) 177 [2301.01826]

[44] C.-L. Wei, G.-L. Li, Y.-D. Fang, X. Zhang, Y. Luo, H. Tian et al., Mock Observations for the CSST Mission: Main Surveys—An Overview of Framework and Simulation Suite, Res. Astron. Astrophys. 26 (2026) [2511.06970]

[45] X. Zhang, Y.-d. Fang, C.-l. Wei, G.-l. Li, F.-s. Liu, H.-x. Ji et al., Mock Observations for the CSST Mission: Main Surveys—the Slitless Spectroscopy Simulation, Res. Astron. Astrophys. 26 (2026) [2511.06917]