pith. sign in

arxiv: 2503.03816 · v2 · submitted 2025-03-05 · 🌌 astro-ph.GA · cs.LG

The Optical and Infrared Are Connected

Pith reviewed 2026-05-23 01:00 UTC · model grok-4.3

classification 🌌 astro-ph.GA cs.LG
keywords galaxy spectrainfrared photometrySED fittingneural networksAGN luminositydust parametersstar formation history
0
0 comments X

The pith

A neural summary of optical SDSS spectra predicts WISE infrared photometry to χ²_N ≈1 accuracy by capturing physical correlations across wavelengths.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Standard galaxy models treat optical and infrared emission as weakly connected because they arise from separate physical components. This paper instead builds a data-driven model that uses a neural network to compress optical spectra and then predicts the full set of WISE infrared bands. The predictions match observed photometry at the level of χ²_N ≈1 and also recover colors and derived quantities such as AGN bolometric luminosity and PAH dust fraction. When the same galaxies are run through current spectral-energy-distribution fitting codes the results are biased and overconfident, with the largest offsets appearing in star-formation rate and AGN luminosity. The optical spectral features that drive the accurate infrared predictions are a small set of absorption and emission lines whose strengths trace the timing of star formation and chemical enrichment.

Core claim

The central claim is that optical and infrared galaxy light are tightly linked through shared physical processes, so that a compact neural summary of an SDSS optical spectrum alone is sufficient to predict WISE photometry at χ²_N ≈1 for every band; the same summary also constrains AGN luminosities and dust parameters, while existing SED models produce biased and overconfident estimates because they do not reproduce these panchromatic relations.

What carries the argument

A neural-network encoder that produces a low-dimensional summary of an optical spectrum and is trained to regress directly onto WISE infrared fluxes, thereby learning the cross-wavelength correlations without explicit physical templates.

If this is right

  • AGN bolometric luminosities and dust parameters such as q_PAH can be inferred from optical spectra alone at the precision previously requiring infrared data.
  • Current SED-fitting pipelines return systematically biased star-formation rates and AGN luminosities because they miss the optical-infrared correlations.
  • The lines Ca II, Sr II, Fe I, [O II], and Hα are the dominant contributors to the infrared prediction, indicating that star-formation chronology is the key missing ingredient in existing models.
  • Panchromatic galaxy properties can be recovered without separate infrared observations once the optical-infrared mapping is learned from data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same encoder approach could be used to predict far-infrared or radio fluxes from optical spectra, testing whether the correlations extend to other wavelengths.
  • If the identified lines are added as explicit constraints in future SED codes, the biases in star-formation rate and AGN luminosity should decrease.
  • Surveys that obtain only optical spectra could still produce infrared-derived catalogs by applying the trained model, provided the training distribution matches the target population.

Load-bearing premise

The neural summary extracts the physical correlations that link optical features to infrared emission rather than simply memorizing survey-specific selection effects or noise patterns in the training data.

What would settle it

Apply the trained model to a completely independent sample of galaxies that have both SDSS spectra and WISE photometry but were never seen during training; if the χ²_N values remain near 1 and the recovered AGN and dust parameters agree with direct infrared measurements, the claim holds.

Figures

Figures reproduced from arXiv: 2503.03816 by Andy D. Goulding, ChangHoon Hahn, Christian K. Jespersen, David N. Spergel, Kartheik G. Iyer, Peter Melchior.

Figure 1
Figure 1. Figure 1: The redshift distribution of the objects used in this work [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: A demonstration of predicting WISE photometry (red markers) from the SDSS spectrum (black), shown in observed wavelengths. Although they fit the optical spectrum perfectly, the predictions from the SED models (left panel) have typical deviations of ∼ 10σ, whereas our empirical approach (right panel) produces typical deviations of ∼ 1σ. That the photometry can empirically be predicted to ∼ 1σ implies that I… view at source ↗
Figure 3
Figure 3. Figure 3: PP-plot showing the cumulative probability that a sample from the posterior from the empirical model, prospector, or CIGALE will be consistent with the data. A summary of the mean probabilities can be found in [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Color-color as measured from WISE (left) and the predictions of our empirical model (right). All data shown are from the test set. The empirical mapping closely reproduces the observed color-color distributions. Extreme values, especially in the bands with large errors (W3 and W4), are disfavoured by the model because they are likely to be produced by measurement error. However, very high values of W1-W2 a… view at source ↗
Figure 6
Figure 6. Figure 6: χ as a function of WISE aperture radius for the predictions from our empirical model, in units of magnitude. Binned trends are plotted for ease of visualization, and show that our model underpredicts the flux (overpredict the mag￾nitude) for larger galaxies. In [PITH_FULL_IMAGE:figures/full_fig_p010_6.png] view at source ↗
Figure 5
Figure 5. Figure 5: prospector priors (grey) and posteriors (blue) for a randomly selected galaxy. The posteriors occupy only a very small part of the prior space. Note that the posterior for fAGN is barely visible, and concentrated in the bottom right corner. There is no AGN in this galaxy. by XMM-NEWTON, since extended sources are likely to be cluster galaxies instead of AGN. To avoid star￾forming galaxies, we must implemen… view at source ↗
Figure 7
Figure 7. Figure 7: The color-color predictions of our empirical model as shown in [PITH_FULL_IMAGE:figures/full_fig_p011_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Predictions of Lbol based only on optical information indicate a surprisingly high degree of determinability of physical AGN parameters from the optical. The AGN are X-ray selected, and fitted with CIGALE using optical spectra, J, H, Ks, and WISE photometry. For objects with broad lines (class B, 1000 km/s < vFWHM < 2000 km/s), the predictability improves a little, whereas for objects with very broad lines… view at source ↗
Figure 9
Figure 9. Figure 9: Correlation between X-ray luminosity (LX) and 6µm luminosity (L6µm) found using either real data or the predicted photometry. Both sets of L6µm closely follow the Chen et al. (2017) XR-MIR relation for AGNs. There is a small bias towards lower L6µm for the predicted WISE photometry, but it is only on the order of 0.06 dex. tical spectrum. The performance of the empirical model is then evaluated on a held-o… view at source ↗
Figure 11
Figure 11. Figure 11: AGN and qPAH emission dominate the IR, but can be undetectable in the optical. If we can constrain these processes by using information from the optical, there must be a tight underlying coupling between the processes which dominate the optical and those which dominate the IR. 5.2. Consequences of the separability assumption Although it is now clear that the assumption of sep￾arability is wrong, it is tem… view at source ↗
Figure 12
Figure 12. Figure 12: The correlation between changes in prospector fAGN (the AGN luminosity as a fraction of the galaxy luminosity) and SFR2 (SFR on 300 Myr - 1 Gyr timescales) from using the optical only to being constrained by both an SDSS spectrum and WISE photometry. The mis￾estimates are very correlated, meaning that although SFR2 is not biased on a population-level, any given galaxy could con￾sistently have misestimated… view at source ↗
Figure 14
Figure 14. Figure 14: Feature attribution from multiscale occlusion (blue) overplotted on average spectra (black) for a range of different line systems (red) for the W3 (12.1 µm) band. The spectra are split into two groups, starforming (top row, SF) and quiescent (bottom row, Q) based on Hα strength. The blue feature importance line goes up if an increased strength of the feature in question (deeper absorption lines/brighter e… view at source ↗
Figure 15
Figure 15. Figure 15: The χ = Fmodel−Fdata σdata distribution for the empirical mapping, CIGALE and prospector for all galaxies from the test set with detections in a given band. The empirical model predicts the photometry to χ 2 N ≈ 1, indicating a high degree of coupling between the optical and IR emission of all galaxies. The empirical mapping does not produce any extreme outliers, although all galaxy types are included her… view at source ↗
Figure 16
Figure 16. Figure 16: χ 2 N -distribuions foror jointly fitting the optical and IR. The χ = Fmodel−Fdata σdata distribution for the empirical mapping predictions, as well as fits from CIGALE and prospector for all galaxies from the test set with detections in a given band. It is clear that CIGALE and prospector continue to be both biased and imprecise compared to the simple empirical model, even when given the photometry. Howe… view at source ↗
Figure 17
Figure 17. Figure 17: Same as [PITH_FULL_IMAGE:figures/full_fig_p022_17.png] view at source ↗
Figure 18
Figure 18. Figure 18: (left) CIGALE fitted bolometric luminosity versus predicted bolometric luminosity. This shows that the predictions are not biased at any specific bolometric luminosity. (middle) XMM X-ray luminosity versus CIGALE fitted bolometric luminosity. Note that the X-ray’s are not used for fitting. (right) XMM X-ray luminosity versus predicted bolometric luminosity. The relations are overall the same. All luminosi… view at source ↗
Figure 19
Figure 19. Figure 19: The astrometric distance match between WISE and SDSS sources in logarithmically colored bins. Almost all (> 99%) of sources are matched to within half an arcsecond, an excellent crossmatching accuracy. Only a few sources are outside the 2 arcsecond limit. This is only done for very bright/extended sources with no other possible match nearby. B. ADDITIONAL PRECISION FIGURES FOR AGN TASKS C. ADDITIONAL DATA… view at source ↗
Figure 20
Figure 20. Figure 20: (left) Prediction made by an empirical model using only line intensities measured for the set of ionization lines shown in [PITH_FULL_IMAGE:figures/full_fig_p026_20.png] view at source ↗
Figure 21
Figure 21. Figure 21: (left) Result of the multiscale occlusion for the W3 band for starforming galaxies. In contrast to [PITH_FULL_IMAGE:figures/full_fig_p027_21.png] view at source ↗
Figure 22
Figure 22. Figure 22: χ-distributions of our predictions split by brightness in the IR/optical, mass, redshift, SFR and S/N. For all options, we show the highest/lowest 10 percent of the quantity in question. See the text for an expanded description of each quantity. • (top right) The mass, as estimated using the methods of Bilicki et al. (2014) and Cluver et al. (2014). • (middle left) The average brightness in the optical. •… view at source ↗
Figure 23
Figure 23. Figure 23: Width and center of the χ-distributions fits in the optical. If perfect, the distribution would be centered at 0 (orange line) and have ±1σ widths of ±1 (red lines). For perfect fits, the χ 2 N should also be close to 1, and the average χ should be close to 0. The fits are very good in general. with global reduced chi-square close to one, and the average χ value close to 0. Our results demonstrate that th… view at source ↗
read the original abstract

Galaxies are often modelled as composites of separable components with distinct spectral signatures, implying that different wavelength ranges are only weakly correlated. They are not. We present a data-driven model which exploits subtle correlations between physical processes to accurately predict infrared (IR) WISE photometry from a neural summary of optical SDSS spectra. The model achieves accuracies of $\chi^2_N \approx 1$ for all photometric bands in WISE, as well as good colors. We are able to tightly constrain typically IR-derived properties, e.g., the bolometric luminosities of AGN and dust parameters such as $\mathrm{q_{PAH}}$. We also test whether current SED-fitting methods reproduce such panchromatic relations, but find their predictions biased and overconfident, likely due to model misspecification, with correlated biases in star-formation rates and AGN luminosities being most evident. To help improve SED models, we determine which features of the optical spectrum are responsible for our improved predictions, and identify several lines (CaII, SrII, FeI, [OII] and H$\alpha$), which point to the complex chronology of star formation and chemical enrichment being incorrectly modelled.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper presents a data-driven neural network model that uses a neural summary of optical SDSS spectra to predict WISE infrared photometry, achieving normalized chi-squared accuracies of approximately 1 across bands and good colors. It claims this exploits physical correlations to tightly constrain IR-derived quantities such as AGN bolometric luminosities and dust parameters like q_PAH, while demonstrating that standard SED-fitting methods yield biased and overconfident predictions (particularly in SFR and AGN luminosity). Feature attribution identifies specific optical lines (Ca II, Sr II, Fe I, [O II], Hα) as responsible, pointing to deficiencies in modeling star-formation chronology and chemical enrichment.

Significance. If the central claim holds after controls for selection effects, the result would be significant for galaxy evolution studies: it provides empirical evidence that optical-IR correlations are stronger than assumed in separable-component SED models and supplies a practical route to improve panchromatic predictions and constrain dust/AGN properties from optical data alone. The identification of specific lines offers a concrete target for refining physical models of star-formation history.

major comments (3)
  1. [Abstract/Methods] Abstract and Methods (training/validation description): The reported χ²_N ≈1 on held-out data is presented without any description of the neural-network architecture, loss function, training/validation splits, sample selection criteria, or error analysis. This omission is load-bearing because the skeptic concern—that the model may be learning joint SDSS/WISE selection functions or redshift-dependent completeness rather than causal physical correlations—cannot be evaluated without these details.
  2. [Results (feature attribution)] Results (feature attribution section): The post-hoc attribution to lines such as Ca II, [O II], and Hα is offered as evidence for physical star-formation chronology, but no explicit control experiments (e.g., redshift-matched subsamples, magnitude-limited mocks, or ablation of fiber-aperture effects) are reported. Without these, the attribution does not rule out that the network is capturing survey selection rather than the claimed physical links.
  3. [SED comparison] SED comparison (bias quantification): The claim that SED-fitting predictions are biased and overconfident is central, yet the manuscript provides no quantitative breakdown of the bias (e.g., systematic offsets in log L_bol or q_PAH as a function of redshift or stellar mass) or the size of the comparison sample. This weakens the assertion that model misspecification, rather than training-data mismatch, is the dominant cause.
minor comments (2)
  1. [Abstract] Notation: χ²_N is used without an explicit definition of the normalization (e.g., per degree of freedom or per band) in the abstract or early sections.
  2. [Figures] Figure clarity: The color-color or residual plots comparing model predictions to WISE data would benefit from explicit indication of the 1:1 line and error bars on the neural-summary predictions.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed comments, which have helped us improve the clarity and robustness of the manuscript. We address each major comment point-by-point below. Revisions have been made to incorporate additional methodological details, control analyses, and quantitative bias assessments where feasible.

read point-by-point responses
  1. Referee: [Abstract/Methods] Abstract and Methods (training/validation description): The reported χ²_N ≈1 on held-out data is presented without any description of the neural-network architecture, loss function, training/validation splits, sample selection criteria, or error analysis. This omission is load-bearing because the skeptic concern—that the model may be learning joint SDSS/WISE selection functions or redshift-dependent completeness rather than causal physical correlations—cannot be evaluated without these details.

    Authors: We agree that the original manuscript omitted sufficient detail on these aspects, which limits evaluation of potential selection effects. In the revised version, the Methods section has been substantially expanded to describe the neural network architecture (a multi-layer perceptron with specific hidden layer dimensions and ReLU activations), the loss function (weighted mean squared error on normalized fluxes), the training/validation split (stratified 80/10/10 train/validation/test with redshift binning), sample selection (SDSS DR16 spectra cross-matched to WISE with quality flags, redshift 0.01 < z < 0.3, magnitude cuts), and error analysis (including bootstrap resampling for uncertainty estimates). These additions directly address the concern about distinguishing physical correlations from selection functions. revision: yes

  2. Referee: [Results (feature attribution)] Results (feature attribution section): The post-hoc attribution to lines such as Ca II, [O II], and Hα is offered as evidence for physical star-formation chronology, but no explicit control experiments (e.g., redshift-matched subsamples, magnitude-limited mocks, or ablation of fiber-aperture effects) are reported. Without these, the attribution does not rule out that the network is capturing survey selection rather than the claimed physical links.

    Authors: This is a valid concern, as the original feature attribution (via integrated gradients) did not include explicit controls. We have added a new subsection in the revised manuscript presenting results from redshift-binned subsamples (showing persistent line importance across z ranges) and checks on fiber aperture effects using aperture-corrected spectra. These controls support that the attributions reflect physical links to star-formation chronology and chemical enrichment rather than pure selection. However, we note that fully exhaustive mock simulations of survey selection were not performed and would require additional computational resources beyond the scope of this work. revision: partial

  3. Referee: [SED comparison] SED comparison (bias quantification): The claim that SED-fitting predictions are biased and overconfident is central, yet the manuscript provides no quantitative breakdown of the bias (e.g., systematic offsets in log L_bol or q_PAH as a function of redshift or stellar mass) or the size of the comparison sample. This weakens the assertion that model misspecification, rather than training-data mismatch, is the dominant cause.

    Authors: We agree that the original manuscript lacked a quantitative breakdown and explicit sample size for the SED comparison. The revised manuscript now includes the comparison sample size (N ≈ 12,500 galaxies with both neural predictions and CIGALE SED fits) and new figures/tables quantifying median systematic offsets in log L_bol(AGN) and q_PAH as functions of redshift and stellar mass. These show correlated biases in SFR and AGN luminosity that are inconsistent with simple training-data mismatch and instead point to model misspecification in separable-component SED modeling. revision: yes

Circularity Check

0 steps flagged

No circularity: standard supervised ML prediction on held-out data

full rationale

The paper trains a neural summary of SDSS optical spectra to predict WISE IR photometry and reports χ²_N ≈1 on (implicitly held-out) data. No equations or steps reduce predictions to fitted inputs by construction; no self-citations are load-bearing for the central claim; feature attribution to lines such as CaII, [OII], Hα is post-hoc analysis rather than a definitional loop. The model is a data-driven regressor whose performance is externally falsifiable on survey photometry, satisfying the criteria for a self-contained result.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The central claim rests on a trained neural network whose parameters are fitted to SDSS-WISE data and on the domain assumption that physical processes create exploitable optical-IR correlations; no new physical entities are postulated.

free parameters (1)
  • neural network parameters
    Weights and biases of the neural summary model are fitted to achieve the reported prediction accuracy on the training data.
axioms (1)
  • domain assumption Subtle correlations exist between optical spectral features and infrared emission properties due to shared physical processes in galaxies
    This assumption underpins the ability of the neural summary to predict WISE photometry from SDSS spectra.

pith-pipeline@v0.9.0 · 5755 in / 1332 out tokens · 51941 ms · 2026-05-23T01:00:48.402397+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. DeepDive: Simultaneous Formation of Massive Quiescent Galaxies in High-Redshift Galaxy Proto-clusters

    astro-ph.GA 2026-04 unverdicted novelty 6.0

    JWST data show massive quiescent galaxies in high-redshift proto-clusters formed and quenched simultaneously, with AGN signatures, indicating environmental triggering of quenching.

  2. Winding Back the Clock: Recent Star Formation Histories of Massive Quiescent Galaxies Are Consistent With Their Rapid Number Density Evolution Since $\mathbf{z\sim7}$

    astro-ph.GA 2026-04 conditional novelty 6.0

    Star formation histories inferred for z=2-5 massive quiescent galaxies imply past number densities that align with observed rapid evolution since z~7.

Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages · cited by 2 Pith papers · 2 internal anchors

  1. [1]

    Agarap, A. F. 2019, Deep Learning using Rectified Linear Units (ReLU). https://arxiv.org/abs/1803.08375 Alsing, J., Thorp, S., Deger, S., et al. 2024, ApJS, 274, 12, doi: 10.3847/1538-4365/ad5c69 Andrae, R., Schulze-Hartung, T., & Melchior, P. 2010, arXiv e-prints, arXiv:1012.3754, doi: 10.48550/arXiv.1012.3754 Astropy Collaboration, Price-Whelan, A. M., ...

  2. [2]

    http://jmlr.org/papers/v12/pedregosa11a.html Piantadosi, S. T. 2018, AIP Advances, 8 Pillepich, A., Nelson, D., Hernquist, L., et al. 2018, MNRAS, 475, 648, doi: 10.1093/mnras/stx3112 Portillo, S. K. N., Parejko, J. K., Vergara, J. R., & Connolly, A. J. 2020, AJ, 160, 45, doi: 10.3847/1538-3881/ab9644 Roach, F. E., & Gordon, J. L. 1973, The Light of the N...

  3. [3]

    This shows that the predictions are not biased at any specific bolometric luminosity

    (left) CIGALE fitted bolometric luminosity versus predicted bolometric luminosity. This shows that the predictions are not biased at any specific bolometric luminosity. (middle) XMM X-ray luminosity versus CIGALE fitted bolometric luminosity. Note that the X-ray’s are not used for fitting. (right) XMM X-ray luminosity versus predicted bolometric luminosit...

  4. [4]

    Almost all (> 99%) of sources are matched to within half an arcsecond, an excellent crossmatching accuracy

    The astrometric distance match between WISE and SDSS sources in logarithmically colored bins. Almost all (> 99%) of sources are matched to within half an arcsecond, an excellent crossmatching accuracy. Only a few sources are outside the 2 arcsecond limit. This is only done for very bright/extended sources with no other possible match nearby. B. ADDITIONAL...

  5. [5]

    We furthermore require there to be no extended sources, since extended X-ray sources are unlikely to be AGN, but instead hot intra- cluster gas

    to select all sources within a common radius of 10”, which we then later restrict further to 5”. We furthermore require there to be no extended sources, since extended X-ray sources are unlikely to be AGN, but instead hot intra- cluster gas. The used XMM-Newton catalogue is the XMM-Newton serendipitous survey 4XMM-DR14 catalogue (https://heasarc.gsfc.nasa...

  6. [6]

    The non-parametric SFH model is detailed in Leja et al

    prospector utilizes a non-parametric SFH model, dividing the galaxy’s history into distinct time bins. The non-parametric SFH model is detailed in Leja et al. (2019a). The emission from the galaxy is shielded by dust. The dust absorption is parametrized by three parameters. The two first are the diffuse dust optical depth, which quantifies the attenuation...

  7. [7]

    and is governed by f AGN, the ratio of the active galactic nucleus (AGN) bolometric luminosity to the total (AGN plus stellar) luminosity, and the optical depth of the dust torus surrounding the AGN, τAGN. F. SCIENCE WITH LATENT SPACES AND OVERPARAMETRIZATION OF MODELS Here we discuss some more tentative aspects of our work. We will discuss the implicatio...

  8. [8]

    Von Neumann’s Elephant

    and the star-forming main sequence (Speagle et al. 2014). The dimensionality of the spender latent space is six, which naively would suggest that the fundamental hyperplane has the same dimensionality. Although this enticingly suggests that not only cosmology, but also galaxy evolution, is nothing but a search for six numbers, one must be cautious when in...

  9. [9]

    The performance is still significantly better than that of SED fitting codes, but a similarity emerges, especially in the width of the distributions

    The performance is an order of magnitude worse when using only the photometry, compared to using the full spectrum. The performance is still significantly better than that of SED fitting codes, but a similarity emerges, especially in the width of the distributions. galaxy models are vastly overparametrized. The historical decoupling of parameters that sho...

  10. [10]

    We can therefore conclude that optical photometry is insufficient to predict IR properties of galaxies. This, in addition to other reasons discussed in §5, could also be one of the reasons for the poor performance of SED-modelling codes, since these codes are almost universally developed with a focus on photometry. It is important to stress that the diffe...

  11. [11]

    For all options, we show the highest/lowest 10 percent of the quantity in question

    χ-distributions of our predictions split by brightness in the IR/optical, mass, redshift, SFR and S/N. For all options, we show the highest/lowest 10 percent of the quantity in question. See the text for an expanded description of each quantity. • (top right) The mass, as estimated using the methods of Bilicki et al. (2014) and Cluver et al. (2014). • (mi...

  12. [12]

    However, despite the χ-distribution metrics, the model does not extrapolate well beyond the fitted wavelength range

    Our results demonstrate that the fits are generally very good, as the χ-distributions align almost perfectly with these ideal expectations. However, despite the χ-distribution metrics, the model does not extrapolate well beyond the fitted wavelength range. This discrepancy suggests that the issue lies not in the model’s flexibility or its ability to fit t...