The Optical and Infrared Are Connected
Pith reviewed 2026-05-23 01:00 UTC · model grok-4.3
The pith
A neural summary of optical SDSS spectra predicts WISE infrared photometry to χ²_N ≈1 accuracy by capturing physical correlations across wavelengths.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that optical and infrared galaxy light are tightly linked through shared physical processes, so that a compact neural summary of an SDSS optical spectrum alone is sufficient to predict WISE photometry at χ²_N ≈1 for every band; the same summary also constrains AGN luminosities and dust parameters, while existing SED models produce biased and overconfident estimates because they do not reproduce these panchromatic relations.
What carries the argument
A neural-network encoder that produces a low-dimensional summary of an optical spectrum and is trained to regress directly onto WISE infrared fluxes, thereby learning the cross-wavelength correlations without explicit physical templates.
If this is right
- AGN bolometric luminosities and dust parameters such as q_PAH can be inferred from optical spectra alone at the precision previously requiring infrared data.
- Current SED-fitting pipelines return systematically biased star-formation rates and AGN luminosities because they miss the optical-infrared correlations.
- The lines Ca II, Sr II, Fe I, [O II], and Hα are the dominant contributors to the infrared prediction, indicating that star-formation chronology is the key missing ingredient in existing models.
- Panchromatic galaxy properties can be recovered without separate infrared observations once the optical-infrared mapping is learned from data.
Where Pith is reading between the lines
- The same encoder approach could be used to predict far-infrared or radio fluxes from optical spectra, testing whether the correlations extend to other wavelengths.
- If the identified lines are added as explicit constraints in future SED codes, the biases in star-formation rate and AGN luminosity should decrease.
- Surveys that obtain only optical spectra could still produce infrared-derived catalogs by applying the trained model, provided the training distribution matches the target population.
Load-bearing premise
The neural summary extracts the physical correlations that link optical features to infrared emission rather than simply memorizing survey-specific selection effects or noise patterns in the training data.
What would settle it
Apply the trained model to a completely independent sample of galaxies that have both SDSS spectra and WISE photometry but were never seen during training; if the χ²_N values remain near 1 and the recovered AGN and dust parameters agree with direct infrared measurements, the claim holds.
Figures
read the original abstract
Galaxies are often modelled as composites of separable components with distinct spectral signatures, implying that different wavelength ranges are only weakly correlated. They are not. We present a data-driven model which exploits subtle correlations between physical processes to accurately predict infrared (IR) WISE photometry from a neural summary of optical SDSS spectra. The model achieves accuracies of $\chi^2_N \approx 1$ for all photometric bands in WISE, as well as good colors. We are able to tightly constrain typically IR-derived properties, e.g., the bolometric luminosities of AGN and dust parameters such as $\mathrm{q_{PAH}}$. We also test whether current SED-fitting methods reproduce such panchromatic relations, but find their predictions biased and overconfident, likely due to model misspecification, with correlated biases in star-formation rates and AGN luminosities being most evident. To help improve SED models, we determine which features of the optical spectrum are responsible for our improved predictions, and identify several lines (CaII, SrII, FeI, [OII] and H$\alpha$), which point to the complex chronology of star formation and chemical enrichment being incorrectly modelled.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a data-driven neural network model that uses a neural summary of optical SDSS spectra to predict WISE infrared photometry, achieving normalized chi-squared accuracies of approximately 1 across bands and good colors. It claims this exploits physical correlations to tightly constrain IR-derived quantities such as AGN bolometric luminosities and dust parameters like q_PAH, while demonstrating that standard SED-fitting methods yield biased and overconfident predictions (particularly in SFR and AGN luminosity). Feature attribution identifies specific optical lines (Ca II, Sr II, Fe I, [O II], Hα) as responsible, pointing to deficiencies in modeling star-formation chronology and chemical enrichment.
Significance. If the central claim holds after controls for selection effects, the result would be significant for galaxy evolution studies: it provides empirical evidence that optical-IR correlations are stronger than assumed in separable-component SED models and supplies a practical route to improve panchromatic predictions and constrain dust/AGN properties from optical data alone. The identification of specific lines offers a concrete target for refining physical models of star-formation history.
major comments (3)
- [Abstract/Methods] Abstract and Methods (training/validation description): The reported χ²_N ≈1 on held-out data is presented without any description of the neural-network architecture, loss function, training/validation splits, sample selection criteria, or error analysis. This omission is load-bearing because the skeptic concern—that the model may be learning joint SDSS/WISE selection functions or redshift-dependent completeness rather than causal physical correlations—cannot be evaluated without these details.
- [Results (feature attribution)] Results (feature attribution section): The post-hoc attribution to lines such as Ca II, [O II], and Hα is offered as evidence for physical star-formation chronology, but no explicit control experiments (e.g., redshift-matched subsamples, magnitude-limited mocks, or ablation of fiber-aperture effects) are reported. Without these, the attribution does not rule out that the network is capturing survey selection rather than the claimed physical links.
- [SED comparison] SED comparison (bias quantification): The claim that SED-fitting predictions are biased and overconfident is central, yet the manuscript provides no quantitative breakdown of the bias (e.g., systematic offsets in log L_bol or q_PAH as a function of redshift or stellar mass) or the size of the comparison sample. This weakens the assertion that model misspecification, rather than training-data mismatch, is the dominant cause.
minor comments (2)
- [Abstract] Notation: χ²_N is used without an explicit definition of the normalization (e.g., per degree of freedom or per band) in the abstract or early sections.
- [Figures] Figure clarity: The color-color or residual plots comparing model predictions to WISE data would benefit from explicit indication of the 1:1 line and error bars on the neural-summary predictions.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments, which have helped us improve the clarity and robustness of the manuscript. We address each major comment point-by-point below. Revisions have been made to incorporate additional methodological details, control analyses, and quantitative bias assessments where feasible.
read point-by-point responses
-
Referee: [Abstract/Methods] Abstract and Methods (training/validation description): The reported χ²_N ≈1 on held-out data is presented without any description of the neural-network architecture, loss function, training/validation splits, sample selection criteria, or error analysis. This omission is load-bearing because the skeptic concern—that the model may be learning joint SDSS/WISE selection functions or redshift-dependent completeness rather than causal physical correlations—cannot be evaluated without these details.
Authors: We agree that the original manuscript omitted sufficient detail on these aspects, which limits evaluation of potential selection effects. In the revised version, the Methods section has been substantially expanded to describe the neural network architecture (a multi-layer perceptron with specific hidden layer dimensions and ReLU activations), the loss function (weighted mean squared error on normalized fluxes), the training/validation split (stratified 80/10/10 train/validation/test with redshift binning), sample selection (SDSS DR16 spectra cross-matched to WISE with quality flags, redshift 0.01 < z < 0.3, magnitude cuts), and error analysis (including bootstrap resampling for uncertainty estimates). These additions directly address the concern about distinguishing physical correlations from selection functions. revision: yes
-
Referee: [Results (feature attribution)] Results (feature attribution section): The post-hoc attribution to lines such as Ca II, [O II], and Hα is offered as evidence for physical star-formation chronology, but no explicit control experiments (e.g., redshift-matched subsamples, magnitude-limited mocks, or ablation of fiber-aperture effects) are reported. Without these, the attribution does not rule out that the network is capturing survey selection rather than the claimed physical links.
Authors: This is a valid concern, as the original feature attribution (via integrated gradients) did not include explicit controls. We have added a new subsection in the revised manuscript presenting results from redshift-binned subsamples (showing persistent line importance across z ranges) and checks on fiber aperture effects using aperture-corrected spectra. These controls support that the attributions reflect physical links to star-formation chronology and chemical enrichment rather than pure selection. However, we note that fully exhaustive mock simulations of survey selection were not performed and would require additional computational resources beyond the scope of this work. revision: partial
-
Referee: [SED comparison] SED comparison (bias quantification): The claim that SED-fitting predictions are biased and overconfident is central, yet the manuscript provides no quantitative breakdown of the bias (e.g., systematic offsets in log L_bol or q_PAH as a function of redshift or stellar mass) or the size of the comparison sample. This weakens the assertion that model misspecification, rather than training-data mismatch, is the dominant cause.
Authors: We agree that the original manuscript lacked a quantitative breakdown and explicit sample size for the SED comparison. The revised manuscript now includes the comparison sample size (N ≈ 12,500 galaxies with both neural predictions and CIGALE SED fits) and new figures/tables quantifying median systematic offsets in log L_bol(AGN) and q_PAH as functions of redshift and stellar mass. These show correlated biases in SFR and AGN luminosity that are inconsistent with simple training-data mismatch and instead point to model misspecification in separable-component SED modeling. revision: yes
Circularity Check
No circularity: standard supervised ML prediction on held-out data
full rationale
The paper trains a neural summary of SDSS optical spectra to predict WISE IR photometry and reports χ²_N ≈1 on (implicitly held-out) data. No equations or steps reduce predictions to fitted inputs by construction; no self-citations are load-bearing for the central claim; feature attribution to lines such as CaII, [OII], Hα is post-hoc analysis rather than a definitional loop. The model is a data-driven regressor whose performance is externally falsifiable on survey photometry, satisfying the criteria for a self-contained result.
Axiom & Free-Parameter Ledger
free parameters (1)
- neural network parameters
axioms (1)
- domain assumption Subtle correlations exist between optical spectral features and infrared emission properties due to shared physical processes in galaxies
Forward citations
Cited by 2 Pith papers
-
DeepDive: Simultaneous Formation of Massive Quiescent Galaxies in High-Redshift Galaxy Proto-clusters
JWST data show massive quiescent galaxies in high-redshift proto-clusters formed and quenched simultaneously, with AGN signatures, indicating environmental triggering of quenching.
-
Winding Back the Clock: Recent Star Formation Histories of Massive Quiescent Galaxies Are Consistent With Their Rapid Number Density Evolution Since $\mathbf{z\sim7}$
Star formation histories inferred for z=2-5 massive quiescent galaxies imply past number densities that align with observed rapid evolution since z~7.
Reference graph
Works this paper leans on
-
[1]
Agarap, A. F. 2019, Deep Learning using Rectified Linear Units (ReLU). https://arxiv.org/abs/1803.08375 Alsing, J., Thorp, S., Deger, S., et al. 2024, ApJS, 274, 12, doi: 10.3847/1538-4365/ad5c69 Andrae, R., Schulze-Hartung, T., & Melchior, P. 2010, arXiv e-prints, arXiv:1012.3754, doi: 10.48550/arXiv.1012.3754 Astropy Collaboration, Price-Whelan, A. M., ...
work page internal anchor Pith review Pith/arXiv arXiv doi:10.3847/1538-4365/ad5c69 2019
-
[2]
http://jmlr.org/papers/v12/pedregosa11a.html Piantadosi, S. T. 2018, AIP Advances, 8 Pillepich, A., Nelson, D., Hernquist, L., et al. 2018, MNRAS, 475, 648, doi: 10.1093/mnras/stx3112 Portillo, S. K. N., Parejko, J. K., Vergara, J. R., & Connolly, A. J. 2020, AJ, 160, 45, doi: 10.3847/1538-3881/ab9644 Roach, F. E., & Gordon, J. L. 1973, The Light of the N...
work page internal anchor Pith review doi:10.1093/mnras/stx3112 2018
-
[3]
This shows that the predictions are not biased at any specific bolometric luminosity
(left) CIGALE fitted bolometric luminosity versus predicted bolometric luminosity. This shows that the predictions are not biased at any specific bolometric luminosity. (middle) XMM X-ray luminosity versus CIGALE fitted bolometric luminosity. Note that the X-ray’s are not used for fitting. (right) XMM X-ray luminosity versus predicted bolometric luminosit...
work page 2024
-
[4]
The astrometric distance match between WISE and SDSS sources in logarithmically colored bins. Almost all (> 99%) of sources are matched to within half an arcsecond, an excellent crossmatching accuracy. Only a few sources are outside the 2 arcsecond limit. This is only done for very bright/extended sources with no other possible match nearby. B. ADDITIONAL...
work page 2016
-
[5]
to select all sources within a common radius of 10”, which we then later restrict further to 5”. We furthermore require there to be no extended sources, since extended X-ray sources are unlikely to be AGN, but instead hot intra- cluster gas. The used XMM-Newton catalogue is the XMM-Newton serendipitous survey 4XMM-DR14 catalogue (https://heasarc.gsfc.nasa...
work page 2013
-
[6]
The non-parametric SFH model is detailed in Leja et al
prospector utilizes a non-parametric SFH model, dividing the galaxy’s history into distinct time bins. The non-parametric SFH model is detailed in Leja et al. (2019a). The emission from the galaxy is shielded by dust. The dust absorption is parametrized by three parameters. The two first are the diffuse dust optical depth, which quantifies the attenuation...
work page 2000
-
[7]
and is governed by f AGN, the ratio of the active galactic nucleus (AGN) bolometric luminosity to the total (AGN plus stellar) luminosity, and the optical depth of the dust torus surrounding the AGN, τAGN. F. SCIENCE WITH LATENT SPACES AND OVERPARAMETRIZATION OF MODELS Here we discuss some more tentative aspects of our work. We will discuss the implicatio...
work page 2024
-
[8]
and the star-forming main sequence (Speagle et al. 2014). The dimensionality of the spender latent space is six, which naively would suggest that the fundamental hyperplane has the same dimensionality. Although this enticingly suggests that not only cosmology, but also galaxy evolution, is nothing but a search for six numbers, one must be cautious when in...
work page 2014
-
[9]
The performance is an order of magnitude worse when using only the photometry, compared to using the full spectrum. The performance is still significantly better than that of SED fitting codes, but a similarity emerges, especially in the width of the distributions. galaxy models are vastly overparametrized. The historical decoupling of parameters that sho...
work page 2020
-
[10]
We can therefore conclude that optical photometry is insufficient to predict IR properties of galaxies. This, in addition to other reasons discussed in §5, could also be one of the reasons for the poor performance of SED-modelling codes, since these codes are almost universally developed with a focus on photometry. It is important to stress that the diffe...
work page 2023
-
[11]
For all options, we show the highest/lowest 10 percent of the quantity in question
χ-distributions of our predictions split by brightness in the IR/optical, mass, redshift, SFR and S/N. For all options, we show the highest/lowest 10 percent of the quantity in question. See the text for an expanded description of each quantity. • (top right) The mass, as estimated using the methods of Bilicki et al. (2014) and Cluver et al. (2014). • (mi...
work page 2014
-
[12]
Our results demonstrate that the fits are generally very good, as the χ-distributions align almost perfectly with these ideal expectations. However, despite the χ-distribution metrics, the model does not extrapolate well beyond the fitted wavelength range. This discrepancy suggests that the issue lies not in the model’s flexibility or its ability to fit t...
work page 2023
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.