pith. machine review for the scientific record.

arxiv: 2604.12414 · v1 · submitted 2026-04-14 · 🌌 astro-ph.GA · astro-ph.IM

Recognition: no theorem link

Enhancing Lyα Emitter Identification in HETDEX with a Convolutional Neural Network

Caryl Gronwall, Daniel J. Farrow, Donald P. Schneider, Dustin Davis, Eric Gawiser, Erin Mentuch Cooper, Julian B. Muñoz, Karl Gebhardt, Lindsay R. House, Mahdi Qezlou, Shiro Mukae, Shun Saito

Pith reviewed 2026-05-10 15:57 UTC · model grok-4.3

classification 🌌 astro-ph.GA astro-ph.IM
keywords Lyα emitters · HETDEX · convolutional neural network · spectroscopic survey · low signal-to-noise · emission line identification · redshift distribution · galaxy clustering

The pith

A convolutional neural network trained on two-dimensional spectral images identifies low signal-to-noise Lyα emitters in HETDEX data with 93 percent recovery of confirmed sources.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a convolutional neural network to improve identification of Lyα emitters in the untargeted HETDEX spectroscopic survey, focusing on the low signal-to-noise regime between 4.8 and 5.5 where noise and artifacts dominate. The model learns from two-dimensional images of single emission lines drawn from the HETDEX COSMOS catalog, supplemented by ancillary observations and citizen-science labels. It reaches balanced accuracy of 94.1 percent at high S/N and 85.1 percent at low S/N, with independent DESI spectroscopy confirming recovery of 99 percent and 93 percent of the respective populations. When applied to the full catalog the network removes spurious features in the redshift distribution between z approximately 1.9 and 2.5, permitting a lower S/N cutoff that supports cleaner galaxy-clustering measurements.

Core claim

The authors show that a convolutional neural network applied to two-dimensional spectroscopic images of single emission lines can separate true Lyα emitters from artifacts and sky residuals. In the high-S/N regime above 5.5 the network attains 94.1 percent balanced accuracy, 97.5 percent precision, and 97.5 percent recall. In the low-S/N regime from 4.8 to 5.5 these figures are 85.1 percent, 78.2 percent, and 84.4 percent. Tested against HETDEX LAEs confirmed by DESI spectroscopy, the model recovers 99 percent of high-S/N sources and 93 percent of low-S/N sources. When run on the entire HETDEX catalog it suppresses spurious spikes in the redshift distribution across z approximately 1.9 to 2.5.
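
The three quoted metrics are not independent. Under the standard definition, balanced accuracy is the mean of recall (the true-positive rate) and specificity (the true-negative rate), so the specificity can be backed out from the other two. The sketch below does that arithmetic, assuming the paper uses the standard definition, which this page does not confirm. The function name is illustrative.

```python
def implied_specificity(balanced_accuracy: float, recall: float) -> float:
    """Solve balanced_accuracy = (recall + specificity) / 2 for specificity."""
    return 2.0 * balanced_accuracy - recall


# Values quoted in the core claim above, written as fractions.
quoted = {
    "high S/N (> 5.5)": {"balanced_accuracy": 0.941, "recall": 0.975},
    "low S/N (4.8 to 5.5)": {"balanced_accuracy": 0.851, "recall": 0.844},
}

for regime, m in quoted.items():
    spec = implied_specificity(m["balanced_accuracy"], m["recall"])
    print(f"{regime}: implied specificity = {spec:.3f}")
```

On that assumption the implied specificity is about 90.7 percent at high S/N and about 85.8 percent at low S/N. These are derived values, not figures reported by the paper.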

What carries the argument

Convolutional neural network that classifies two-dimensional spectral images of individual emission lines by attending to smooth central emission in true positives and irregular or noisy patterns in contaminants.
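
To make the mechanism concrete, here is a toy forward pass in pure Python: one convolution, a ReLU, global average pooling, and a sigmoid that yields a score between 0 and 1. It is a sketch only. The kernel, weight, bias, and 5x5 cutouts are hand-picked for illustration, nothing is trained, and it is not the architecture used in the paper.

```python
import math


def conv2d_valid(image, kernel):
    """Slide a kernel over a 2D image (no padding, stride 1).

    This is cross-correlation, which is what deep-learning libraries
    conventionally call convolution.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def toy_cnn_score(image, kernel, weight, bias):
    """Convolution -> ReLU -> global average pooling -> sigmoid score."""
    feature_map = conv2d_valid(image, kernel)
    activated = [max(0.0, v) for row in feature_map for v in row]
    pooled = sum(activated) / len(activated)
    return 1.0 / (1.0 + math.exp(-(weight * pooled + bias)))


# Hypothetical 5x5 cutouts: a smooth central blob and an empty frame.
blob = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 1, 0],
    [0, 2, 4, 2, 0],
    [0, 1, 2, 1, 0],
    [0, 0, 0, 0, 0],
]
empty = [[0] * 5 for _ in range(5)]
kernel = [[1 / 9] * 3 for _ in range(3)]  # hand-picked 3x3 averaging kernel

blob_score = toy_cnn_score(blob, kernel, weight=2.0, bias=-1.0)
empty_score = toy_cnn_score(empty, kernel, weight=2.0, bias=-1.0)
print(blob_score, empty_score)
```

With these hand-set parameters the blob scores higher than the empty frame. A trained network learns its kernels and weights from labeled images instead.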

If this is right

  • The network recovers 99 percent of high-S/N and 93 percent of low-S/N LAEs independently confirmed by DESI spectroscopy.
  • It achieves 94.1 percent balanced accuracy at S/N greater than 5.5 and 85.1 percent at S/N between 4.8 and 5.5.
  • Application to the full catalog removes spurious redshift spikes between z approximately 1.9 and 2.5.
  • The cleaned sample mitigates false positives in galaxy clustering statistics used for cosmological analyses.
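
The recovery fractions in the first bullet are, in form, the share of independently confirmed sources that the classifier accepts in each S/N regime. A minimal sketch of that bookkeeping follows. The score threshold of 0.5 matches the value quoted in the figure captions, but whether the DESI comparison used it is an assumption, and the rows are made-up illustrative values rather than survey data.

```python
def recovery_by_regime(confirmed, score_threshold=0.5, sn_split=5.5, sn_floor=4.8):
    """Fraction of confirmed sources scoring above the threshold, per S/N regime.

    `confirmed` is an iterable of (signal_to_noise, cnn_score) pairs.
    """
    counts = {"high": [0, 0], "low": [0, 0]}  # [recovered, total]
    for sn, score in confirmed:
        if sn > sn_split:
            regime = "high"
        elif sn >= sn_floor:
            regime = "low"
        else:
            continue  # below the survey S/N floor, not counted
        counts[regime][1] += 1
        if score > score_threshold:
            counts[regime][0] += 1
    return {k: (r / n if n else None) for k, (r, n) in counts.items()}


# Hypothetical (S/N, CNN score) pairs for confirmed sources.
toy_confirmed = [(6.2, 0.98), (7.0, 0.91), (5.0, 0.80), (5.3, 0.35), (4.9, 0.66), (4.5, 0.90)]
toy_recovery = recovery_by_regime(toy_confirmed)
print(toy_recovery)
```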

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same image-based classification approach could be retrained on data from other untargeted spectroscopic surveys to extend reliable emission-line detection to fainter limits.
  • Visual attribution maps that highlight smooth extended emission suggest the network is learning physical source morphology rather than purely statistical noise patterns.
  • Lowering the S/N threshold while controlling contaminants may enlarge the usable LAE sample enough to tighten constraints on large-scale structure at z approximately 2.
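
One way to put a number on the attribution-map reading in the second bullet is to measure how much of a map's weight falls near the image center. The sketch below is a hypothetical metric, not something the paper reports: `central_attribution_fraction` and the toy maps are illustrative, and a real test would use the Grad-CAM++ maps themselves.

```python
def central_attribution_fraction(attribution, half_width=1):
    """Share of total non-negative attribution inside a box centered on the map."""
    n_rows, n_cols = len(attribution), len(attribution[0])
    ci, cj = n_rows // 2, n_cols // 2
    total = 0.0
    central = 0.0
    for i, row in enumerate(attribution):
        for j, value in enumerate(row):
            v = max(0.0, value)  # ignore negative attribution
            total += v
            if abs(i - ci) <= half_width and abs(j - cj) <= half_width:
                central += v
    return central / total if total > 0 else 0.0


# Hypothetical 5x5 maps: one concentrated on the center, one scattered.
centered_map = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 1, 0],
    [0, 2, 4, 2, 0],
    [0, 1, 2, 1, 0],
    [0, 0, 0, 0, 0],
]
scattered_map = [
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
]
print(central_attribution_fraction(centered_map))
print(central_attribution_fraction(scattered_map))
```

A consistently high central fraction for true positives and a low one for contaminants would support the morphological reading. The box size is a free choice and would need its own justification.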

Load-bearing premise

The training set assembled from the HETDEX COSMOS field plus ancillary and citizen-science labels captures the full range of artifacts, sky residuals, and instrumental effects present across the entire survey footprint.

What would settle it

Applying the trained network to an independent HETDEX field outside the COSMOS training area and finding either persistent spurious spikes in the redshift distribution between z 1.9 and 2.5 or recovery below 90 percent of DESI-confirmed low-S/N LAEs.
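
The test above has two measurable parts: a recovery fraction and the presence of spikes in a binned redshift distribution. The sketch below combines them under stated assumptions. The 90 percent floor comes from the sentence above. The spike rule (a bin exceeding three times the median of its neighbors) and the bin counts are hypothetical choices for illustration, not the paper's procedure.

```python
from statistics import median


def spike_bins(counts, window=2, factor=3.0):
    """Indices of bins whose count exceeds factor x the median of nearby bins."""
    flagged = []
    for i, c in enumerate(counts):
        neighbours = counts[max(0, i - window):i] + counts[i + 1:i + 1 + window]
        # Floor the reference level at 1 so empty neighborhoods do not flag noise.
        if neighbours and c > factor * max(median(neighbours), 1):
            flagged.append(i)
    return flagged


def claim_survives(low_sn_recovery, counts, min_recovery=0.90):
    """True if recovery meets the floor and no redshift bin is flagged as a spike."""
    return low_sn_recovery >= min_recovery and not spike_bins(counts)


# Hypothetical per-bin source counts across z = 1.9 to 2.5.
smooth_counts = [10, 12, 11, 13, 9, 10, 11]
spiky_counts = [10, 12, 11, 48, 9, 10, 11]

print(spike_bins(spiky_counts))
print(claim_survives(0.93, smooth_counts))
print(claim_survives(0.93, spiky_counts))
```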

Figures

Figures reproduced from arXiv: 2604.12414 by Caryl Gronwall, Daniel J. Farrow, Donald P. Schneider, Dustin Davis, Eric Gawiser, Erin Mentuch Cooper, Julian B. Muñoz, Karl Gebhardt, Lindsay R. House, Mahdi Qezlou, Shiro Mukae, Shun Saito.

Figure 1: Redshift distribution and cumulative number of HETDEX LAE candidates as a function of S/N
Figure 2: Schematic illustration of (a) fiber layout, (b) fiber spectral arrays, (c) 2D spectral images, and (d) …
Figure 3: Schematic overview of the CNN architecture employed in this work.
Figure 4: Histograms for HETDEX COSMOS LAE candidates: (a) Redshift, (b) Ly…
Figure 5: Learning curves of CNN model training with three-fold cross-validation. Top: …
Figure 6: Performance metrics of the CNN model. The trained model is applied to the test set (1697 2D spectral images) drawn from the HETDEX COSMOS catalog. (a) Distributions of CNN scores. The orange and blue histograms represent sources labeled as Likely Real and Unlikely Real, respectively. The left and right panels show the high-S/N (S/N > 5.5) and low-S/N (4.8 ≤ S/N ≤ 5.5) subsets. (b) Confusion matrices at CNN…
Figure 7: Precision curves as a function of CNN score threshold across various S/N limits.
Figure 8: Classification examples for the TP, FP, FN, and TN categories.
Figure 9: Grad-CAM++ maps for the TP, FP, FN, and TN classifications.
Figure 10: Redshift distribution of DESI-confirmed HETDEX LAEs used in this study, along with histograms …
Figure 11: Distribution of CNN scores for LAE candidates in the HETDEX DEX catalog across different S/N ranges. The dark and light gray indicate sources with S/N > 5.5 and S/N ≤ 5.5, respectively. … candidates selected at a CNN score threshold of 0.5, where the overall precision reaches nearly 90% when the high-S/N and low-S/N regimes are combined (Section 4.1). Both the distributions closely resemble those of the …
Figure 12: Distribution of Lyα luminosity and FWHM for LAE candidates in the HETDEX DEX catalog at a CNN score threshold of 0.5. The yellow and blue colors indicate LAE candidates with CNN scores > 0.5 and ≤ 0.5, respectively. The vertical black dashed line in the left panel marks the Lyα luminosity of ∼ 1.0×10^42.8 erg s^−1 at z ∼ 2.7, which corresponds to the HETDEX survey's sensitivity limit, as in …
Figure 13: Redshift distributions of LAE candidates selected with different CNN score thresholds in the …
Figure 14: Cumulative number of LAE candidates selected with different CNN score thresholds in the HETDEX DEX catalog. Same as the right panel of Figure 1, but for LAE candidates selected using CNN score thresholds of 0.70 (cyan), 0.50 (blue), and 0.30 (green), along with data quality flags (gray). Each curve shows the cumulative number of LAE candidates per VIRUS IFU as a function of the S/N threshold. The dashed…
Original abstract

We present a deep learning framework to enhance the identification of Ly$\alpha$ emitters (LAEs) in the Hobby-Eberly Telescope Dark Energy Experiment (HETDEX), an untargeted spectroscopic survey of LAEs at $1.9 < z < 3.5$ without imaging pre-selection. We primarily address the low signal-to-noise ratio (S/N) regime ($4.8 \leq \mathrm{S/N} \leq 5.5$), where LAE candidates suffer from substantial noise contamination. To distinguish LAE candidates from artifacts and sky residuals, we employ a convolutional neural network (CNN) trained on two-dimensional spectral images of single emission lines. The training sample is constructed from the HETDEX COSMOS catalog, with external validation from ancillary observations and our participatory science project, \textit{Dark Energy Explorers}. For small-format, low-resolution spectroscopic data, the model achieves a balanced accuracy, precision, and recall of $94.1\%$, $97.5\%$, and $97.5\%$, respectively, in the high-S/N regime ($\mathrm{S/N}>5.5$), and $85.1\%$, $78.2\%$, and $84.4\%$ in the low-S/N regime. Using HETDEX LAEs independently identified by DESI spectroscopy, the model recovers $99\%$ and $93\%$ of the high- and low-S/N LAEs, respectively. Visual attribution indicates that the CNN attends to smooth, spatially extended central emission in true positives and to irregular or noisy features in true negatives. Applied to the full HETDEX catalog, the CNN enables an S/N threshold down to 4.8 by suppressing spurious spikes across $z\sim 1.9$--$2.5$ in the redshift distribution. Our approach facilitates HETDEX cosmological analyses by mitigating false positives in galaxy clustering and highlights the value of domain-specific deep learning for refining low-S/N spectroscopic identification in untargeted surveys.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper presents a convolutional neural network (CNN) trained on 2D spectral images of single emission lines from the HETDEX COSMOS catalog (augmented with ancillary observations and citizen-science labels from Dark Energy Explorers) to classify Lyα emitters (LAEs) versus artifacts in the low-S/N regime (4.8 ≤ S/N ≤ 5.5). It reports balanced accuracy/precision/recall of 94.1%/97.5%/97.5% (high-S/N >5.5) and 85.1%/78.2%/84.4% (low-S/N), recovers 99% and 93% of high- and low-S/N LAEs independently confirmed by DESI spectroscopy, and applies the model to the full HETDEX catalog to lower the S/N threshold to 4.8 while suppressing spurious spikes in the redshift distribution between z~1.9–2.5.

Significance. If the reported generalization holds, the result would be significant for untargeted spectroscopic surveys: it provides a practical, interpretable (via visual attribution) method to increase LAE sample completeness without inflating false positives that bias galaxy clustering measurements. The use of external DESI validation and participatory-science labels is a clear strength, and the concrete recovery fractions on held-out data support the central empirical claim.

major comments (3)
  1. [Training sample construction and full-catalog application] The training sample is drawn exclusively from the HETDEX COSMOS catalog. No cross-field validation or ablation on other HETDEX pointings is presented to test whether sky residuals, instrumental spikes, or noise properties differ outside COSMOS. This assumption is load-bearing for the claim that the CNN can be applied to the full 540 deg² catalog and safely lower the S/N threshold to 4.8 while suppressing redshift spikes (abstract and § on full-catalog application).
  2. [DESI validation results] The DESI validation set yields the headline recovery rates (99%/93%). The manuscript must explicitly state the sky overlap and observing-condition similarity between the DESI fields used for validation and the COSMOS training field; if they largely coincide, the test does not probe out-of-distribution behavior across the full footprint.
  3. [Methods and model description] The abstract and results quote specific accuracy/precision/recall numbers, yet the manuscript provides insufficient detail on CNN architecture (layers, kernel sizes, input preprocessing), training protocol (loss, optimizer, hyperparameters, data augmentation), and the precise train/validation/test split. These choices are free parameters that directly affect the quoted metrics and must be documented for reproducibility.
minor comments (2)
  1. [S/N regime definition] The S/N regime boundaries (4.8 and 5.5) are stated without derivation or sensitivity test; a brief justification or robustness check would clarify whether they are data-driven or chosen ad hoc.
  2. [Interpretability analysis] Visual attribution maps are described qualitatively; a quantitative comparison (e.g., overlap with emission-line centroids) would strengthen the interpretability section.
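
The cross-field validation requested in major comment 1 could take the form of a leave-one-field-out split: train on every field but one, test on the held-out field, and repeat. The sketch below shows only the index bookkeeping. The field names other than COSMOS are placeholders, the labels are made up, and the function is illustrative rather than anything the authors describe.

```python
def leave_one_field_out(samples):
    """Yield (held_out_field, train_indices, test_indices) for each distinct field.

    `samples` is a list of (field_name, label) pairs.
    """
    fields = sorted({field for field, _ in samples})
    for held_out in fields:
        train = [i for i, (f, _) in enumerate(samples) if f != held_out]
        test = [i for i, (f, _) in enumerate(samples) if f == held_out]
        yield held_out, train, test


# Hypothetical samples; "field_B" and "field_C" are placeholder names.
toy_samples = [
    ("COSMOS", 1),
    ("COSMOS", 0),
    ("field_B", 1),
    ("field_C", 0),
    ("field_B", 0),
]
splits = list(leave_one_field_out(toy_samples))
for held_out, train, test in splits:
    print(held_out, train, test)
```

Because no field contributes to both sides of a split, a drop in the metrics on held-out fields would point to field-specific artifacts that the COSMOS-only training set does not capture.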

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed review of our manuscript. We address each major comment point by point below, providing clarifications where possible and committing to revisions that strengthen the paper without misrepresenting our results.

Point-by-point responses
  1. Referee: [Training sample construction and full-catalog application] The training sample is drawn exclusively from the HETDEX COSMOS catalog. No cross-field validation or ablation on other HETDEX pointings is presented to test whether sky residuals, instrumental spikes, or noise properties differ outside COSMOS. This assumption is load-bearing for the claim that the CNN can be applied to the full 540 deg² catalog and safely lower the S/N threshold to 4.8 while suppressing redshift spikes (abstract and § on full-catalog application).

    Authors: We selected the COSMOS field for training because it provides the richest ancillary data and citizen-science labels from Dark Energy Explorers, enabling reliable ground-truth classification. All HETDEX pointings employ the same VIRUS instrument and similar observing protocols, so instrumental artifacts and noise characteristics are expected to be consistent across the survey. Nevertheless, we acknowledge the value of explicit cross-field testing. In the revised manuscript we will add a dedicated paragraph in the Methods and Discussion sections justifying the representativeness of COSMOS data, quantifying the uniformity of HETDEX observing conditions, and explicitly noting the limitation that full cross-field ablation was not performed. We will also state that future work will incorporate additional fields once labeled data become available. revision: partial

  2. Referee: [DESI validation results] The DESI validation set yields the headline recovery rates (99%/93%). The manuscript must explicitly state the sky overlap and observing-condition similarity between the DESI fields used for validation and the COSMOS training field; if they largely coincide, the test does not probe out-of-distribution behavior across the full footprint.

    Authors: We will revise the relevant section to provide a clear statement of the sky overlap and observing conditions. The DESI-confirmed LAEs used for validation come from HETDEX fields that include both the COSMOS region and additional pointings outside COSMOS, thereby offering a partial out-of-distribution test. We will quantify the fractional overlap, note the similarity in exposure times and seeing, and discuss how the 93% low-S/N recovery rate supports generalization beyond the training field. revision: yes

  3. Referee: [Methods and model description] The abstract and results quote specific accuracy/precision/recall numbers, yet the manuscript provides insufficient detail on CNN architecture (layers, kernel sizes, input preprocessing), training protocol (loss, optimizer, hyperparameters, data augmentation), and the precise train/validation/test split. These choices are free parameters that directly affect the quoted metrics and must be documented for reproducibility.

    Authors: We agree that the current Methods section lacks sufficient detail for full reproducibility. In the revised manuscript we will expand the architecture description to list the number of convolutional layers, kernel sizes, stride, padding, activation functions, and pooling operations; specify all preprocessing steps (normalization, resizing, and noise handling); detail the training protocol including the loss function, optimizer, learning-rate schedule, batch size, number of epochs, and early-stopping criteria; describe the data-augmentation pipeline; and report the exact train/validation/test split fractions together with the rationale for the split. A summary table of all hyperparameters will be added. revision: yes
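
The split documentation promised in response 3 matters because the way folds are drawn changes the class balance each model sees. Figure 5 mentions three-fold cross-validation. The sketch below shows one common way to build such folds, stratified by label. Whether the authors stratified is not stated, so the function, seed, and labels here are assumptions for illustration.

```python
import random


def stratified_k_fold(labels, k=3, seed=0):
    """Assign each sample index to one of k folds, class by class.

    Dealing each class round-robin keeps class proportions roughly equal
    across folds.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return [sorted(f) for f in folds]


# Hypothetical labels: six "Likely Real" (1) and three "Unlikely Real" (0).
toy_labels = [1] * 6 + [0] * 3
toy_folds = stratified_k_fold(toy_labels)
for fold in toy_folds:
    print(fold, [toy_labels[i] for i in fold])
```

Each fold then serves once as the validation set while the other two are used for training. Reporting the seed alongside the split fractions makes the partition reproducible.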

Circularity Check

0 steps flagged

No significant circularity; empirical ML performance on independent validation data

Full rationale

The paper reports an empirical CNN classifier trained on HETDEX COSMOS spectra and validated against DESI spectroscopy plus citizen-science labels. Performance numbers (99%/93% recovery, balanced accuracy 94.1%/85.1%) are measured on held-out or externally labeled sets rather than derived from any equation that reduces to the training inputs by construction. No self-definitional loops, fitted parameters renamed as predictions, or load-bearing self-citations appear in the reported results. The central claim is a data-driven classification improvement, not a theoretical derivation.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 0 invented entities

The claim rests on standard supervised-learning assumptions plus domain-specific choices of training data and S/N cuts; no new physical entities are postulated.

free parameters (2)
  • CNN architecture and training hyperparameters
    Layer count, filter sizes, learning rate, and regularization chosen to maximize validation metrics on the COSMOS training set.
  • S/N regime boundaries (4.8 and 5.5)
    Thresholds selected after model evaluation to balance sample size against contamination.
axioms (1)
  • domain assumption: 2D spectral images of single emission lines contain distinguishable morphological features for real LAEs versus artifacts
    Invoked by training the CNN directly on these images rather than extracted 1D spectra.

pith-pipeline@v0.9.0 · 5726 in / 1491 out tokens · 81821 ms · 2026-05-10T15:57:09.140047+00:00 · methodology


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Unveiling Hidden Lyman Alpha Emitters in the DESI DR1 Data

    astro-ph.GA 2026-05 unverdicted novelty 5.0

    A CNN detects 19,685 LAEs at z=2-3.5 in DESI DR1 spectra with 95% purity and completeness.
