pith. machine review for the scientific record.

arxiv: 2604.25786 · v1 · submitted 2026-04-28 · 🌌 astro-ph.GA · astro-ph.IM

Recognition: unknown

Homogeneous Stellar Parameters from Heterogeneous Spectra with Deep Learning

Jeff Shen, Joshua S. Speagle, Shirley Ho

Pith reviewed 2026-05-07 15:40 UTC · model grok-4.3

classification 🌌 astro-ph.GA astro-ph.IM
keywords deep learning · stellar spectroscopy · chemical abundances · Galactic archaeology · Transformer model · homogeneous parameters · spectroscopic surveys · Milky Way

The pith

A Transformer model trained jointly on spectra from multiple surveys produces consistent stellar parameters, 20 chemical abundances, distances, and ages on one scale.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Large spectroscopic surveys derive stellar properties with independent pipelines that introduce systematic offsets, which obscure chemical patterns and complicate studies of the Milky Way's history. This paper develops a single deep-learning system using a Transformer that accepts spectra of any wavelength range and resolution and outputs atmospheric parameters, abundances for twenty elements, distances, and ages referenced to the same scale. The model trains end-to-end across all surveys together, removing the need for later adjustments between datasets. A sympathetic reader would value this because it supplies a uniform foundation for combining millions of observations into reliable maps of galactic formation and evolution.

Core claim

The authors introduce a unified deep-learning framework that employs a Transformer neural network to ingest stellar spectra of arbitrary wavelength coverage and spectral resolution. Trained as one end-to-end model on data from several surveys simultaneously, the network outputs effective temperature, surface gravity, iron abundance, abundances for twenty additional elements, distances, and ages, all placed on a single self-consistent scale without post-hoc recalibration. Results maintain consistency for the same stars observed across different surveys and match external validation sets such as distance catalogs and open-cluster properties.

What carries the argument

An end-to-end trained Transformer model that ingests spectra of arbitrary wavelength range and resolution and is trained jointly across heterogeneous surveys to enforce output consistency.
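
To make the mechanism concrete, here is a minimal sketch of this kind of architecture, assuming flux values are paired with their wavelengths as input tokens. This is illustrative only, not the authors' implementation; every name and size below is hypothetical.

```python
# Illustrative sketch (not the paper's code): a Transformer that ingests
# (flux, log-wavelength) token pairs, so spectra of any wavelength range
# and resolution share one encoder without resampling to a common grid.
import torch
import torch.nn as nn

class SpectrumEncoder(nn.Module):
    def __init__(self, d_model=128, n_heads=4, n_layers=4, n_labels=26):
        super().__init__()
        # Embed each (flux, log-wavelength) pair into a wavelength-aware token.
        self.embed = nn.Linear(2, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        # Pooled embedding -> stellar labels (Teff, log g, [Fe/H], ..., age).
        self.head = nn.Linear(d_model, n_labels)

    def forward(self, flux, log_wave, pad_mask=None):
        # flux, log_wave: (batch, n_pixels); pad_mask is True on padded pixels.
        tokens = self.embed(torch.stack([flux, log_wave], dim=-1))
        h = self.encoder(tokens, src_key_padding_mask=pad_mask)
        if pad_mask is not None:  # masked mean-pool over real pixels only
            keep = (~pad_mask).unsqueeze(-1).float()
            pooled = (h * keep).sum(dim=1) / keep.sum(dim=1).clamp(min=1.0)
        else:
            pooled = h.mean(dim=1)
        return self.head(pooled)
```

Under this design, surveys of different resolution differ only in sequence length, which is what allows one set of weights to serve all inputs.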

Load-bearing premise

That a single model trained on mixed survey data can remove all systematic differences between independent analysis pipelines while preserving true physical stellar properties and without adding new survey-specific biases.

What would settle it

If stars observed by more than one survey receive derived parameters that differ by amounts larger than the model's stated uncertainties, the claim of cross-survey homogeneity would be falsified.
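
A hedged sketch of that test, with made-up numbers: for overlap stars, the normalized differences should be roughly standard normal if both the homogeneity claim and the stated uncertainties hold.

```python
# Sketch of the falsification test: compare labels for the same stars
# from surveys A and B against their combined reported uncertainties.
import numpy as np

def cross_survey_z(y_a, sig_a, y_b, sig_b):
    """Normalized differences; ~N(0, 1) if predictions are homogeneous
    and the reported uncertainties are honest."""
    return (y_a - y_b) / np.sqrt(sig_a**2 + sig_b**2)

# Hypothetical Teff values for five overlap stars.
z = cross_survey_z(
    y_a=np.array([4820.0, 5510.0, 6035.0, 4490.0, 5260.0]),
    sig_a=np.array([20.0, 25.0, 40.0, 18.0, 22.0]),
    y_b=np.array([4850.0, 5470.0, 6120.0, 4505.0, 5230.0]),
    sig_b=np.array([55.0, 60.0, 48.0, 50.0, 52.0]),
)
print(f"fraction with |z| > 3: {np.mean(np.abs(z) > 3):.2f}")  # large => falsified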

Figures

Figures reproduced from arXiv: 2604.25786 by Jeff Shen, Joshua S. Speagle, Shirley Ho.

Figure 1
Figure 1. Brief overview of the model architecture. The encoder creates homogeneous, wavelength-aware embeddings from heterogeneous input spectra; these embeddings are then pooled and mapped to stellar labels. Figure adapted from J. Shen et al. (2025). view at source ↗
Figure 2
Figure 2. Ground-truth label values (x-axes) vs. predicted label values (y-axes) for atmospheric parameters, chemical abundances, distances, and stellar ages, aggregated across all samples in the validation dataset and coloured by log density (red indicates more samples in that region, blue fewer). Black lines indicate 1-to-1 relations; in general the model predictions are very tightly clustered… view at source ↗
Figure 3
Figure 3. Same as … view at source ↗
Figure 4
Figure 4. Residuals (x-axis) vs. total model uncertainty, aggregated across all samples in the validation dataset and coloured by log density (red indicates more samples in that region, blue fewer). Across all labels, the predictive uncertainty from the model is positively correlated with actual residuals. view at source ↗
Figure 5
Figure 5. Model uncertainty across parameter space. The top-left panel shows a Kiel diagram coloured by sample density (red indicates fewer samples), while the remaining panels show the same diagram coloured by the predicted uncertainty in Teff (top right), log g (bottom left), and [Fe/H] (bottom right). Note that the colour scale for sample density is inverted relative to the uncertainty panels to facilitate comparison. view at source ↗
Figure 6
Figure 6. PIT histograms for all labels, broken down by survey. A perfectly calibrated model would produce a uniform distribution (dashed black line). A U-shaped histogram indicates underestimated uncertainties, while an inverted-U shape indicates overestimated uncertainties (see the minimal PIT sketch after this figure list). view at source ↗
Figure 7
Figure 7. Calibration quality of predicted uncertainties across the Kiel diagram for selected labels, aggregated across all input surveys. Colour indicates ln DKS, the logarithm of the Kolmogorov–Smirnov statistic of the PIT distribution against a uniform reference distribution. More negative values (darker blue) indicate better-calibrated uncertainties. view at source ↗
Figure 8
Figure 8. Cross-survey consistency check. Each off-diagonal panel compares predicted labels for stars observed by two different surveys. Panels above/to the right of the diagonal show Teff, and panels below/to the left show log g. The diagonal labels indicate the corresponding survey. All panels are coloured by the larger of the two model-predicted uncertainties. We use the T. Cantat-Gaudin et al. (2018) and L. Spi… view at source ↗
Figure 9
Figure 9. Same as … view at source ↗
Figure 10
Figure 10. Comparison of cluster metallicities from U. Heiter et al. (2014), R. Carrera et al. (2019), L. Spina et al. (2021), and W. S. Dias et al. (2021) to the median [Fe/H] estimated from our catalog for member stars in each cluster. Error bars on our estimates indicate the standard deviation of our per-star [Fe/H] estimates within each cluster. Only clusters with more than 5 members in the cross-match with our catalog are shown. view at source ↗
Figure 11
Figure 11. Comparison of our predicted distances to three external catalogs: D. W. Hogg et al. (2019) (left), S. Li et al. (2025) (centre), and A. B. A. Queiroz et al. (2023) (right). Each panel shows a hexbin density plot of reference distance vs. predicted distance in log–log space, coloured by number count. The solid line marks the 1-to-1 relation, and the dashed line is the best-fit linear model log dref = a log… view at source ↗
Figure 12
Figure 12. Comparison of cluster ages from D. Bossini et al. (2019), W. S. Dias et al. (2021), L. Spina et al. (2021), and Y. Tarricq et al. (2021) to the median age estimated from our catalog for member stars in each cluster. Error bars on our estimates indicate the standard deviation of our per-star age estimates within each cluster. Only clusters with more than 5 members in the cross-match with our catalog are shown. view at source ↗
Figure 13
Figure 13. Same as … view at source ↗
Figure 14
Figure 14. Comparison of our predicted stellar parameters from GALAH spectra to the values reported by the GALAH pipeline for Teff (top left), log g (top right), [Fe/H] (bottom left), and [Mg/Fe] (bottom right). Quality cuts have been applied to both sets of labels (see text). The black line indicates the 1-to-1 relation. Systematic differences between the two pipelines are visible, including a slope difference in Teff… view at source ↗
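
The PIT diagnostic behind Figures 6 and 7 can be sketched in a few lines, assuming Gaussian predictive distributions per star; the paper's actual predictive form may differ, and the numbers below are synthetic.

```python
# Minimal PIT calibration sketch, assuming Gaussian predictions N(mu, sigma).
# PIT values are uniform on [0, 1] iff the uncertainties are calibrated;
# the KS statistic against a uniform reference gives the ln DKS of Figure 7.
import numpy as np
from scipy import stats

def pit_ln_dks(y_true, mu, sigma):
    pit = stats.norm.cdf(y_true, loc=mu, scale=sigma)  # CDF evaluated at truth
    d_ks = stats.kstest(pit, "uniform").statistic
    return pit, np.log(d_ks)

rng = np.random.default_rng(0)
mu = rng.normal(0.0, 1.0, 5000)
y_true = mu + rng.normal(0.0, 0.10, 5000)         # true scatter is 0.10
_, ln_dks = pit_ln_dks(y_true, mu, sigma=0.05)    # understated sigma
print(f"ln DKS = {ln_dks:.2f}")  # poor calibration -> U-shaped PIT histogram
```
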
read the original abstract

Large-scale spectroscopic surveys have collectively observed millions of stars across the Milky Way, but each derives stellar labels using independent pipelines with distinct modelling assumptions, introducing systematic offsets that obscure signals in chemical space and hinder large-scale Galactic archaeology. We present a unified deep-learning framework that delivers atmospheric parameters, chemical abundances for 20 elements, distances, and ages -- all on a single, self-consistent scale -- for an arbitrary number of spectroscopic surveys simultaneously. Our approach uses a Transformer model that ingests spectra of arbitrary wavelength range and resolution, trained end-to-end as a single model across all surveys, eliminating the need for post-hoc recalibration. We apply this framework to spectra from APOGEE DR17, GALAH DR3, DESI DR1, and $\textit{Gaia}$ RVS DR3, spanning resolutions from R ~ 2,000 to 28,000 and wavelengths from the optical to the near-infrared. On high-resolution APOGEE spectra the model achieves precisions of $18~$K in $\textrm{T}_{\rm eff}$, $0.04~$dex in $\textrm{log}\,\textit{g}$, $0.015~$dex in [Fe/H], and ${<}\,0.03~$dex across all abundances; on lower-resolution DESI spectra, typical precisions are $51~$K, $0.09~$dex, $0.04~$dex, and ${\sim}\,0.06~$dex, respectively. Cross-survey comparisons demonstrate that labels for the same stars observed by different surveys are consistent within model uncertainties; we further validate against external distance catalogs and open cluster metallicities and ages. The resulting homogeneous catalog enables Galactic archaeology at unprecedented scale and consistency, and the framework is readily extensible to forthcoming spectroscopic surveys such as SDSS-V, WEAVE, and 4MOST. The catalog is publicly available at https://doi.org/10.5281/zenodo.19830515.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript claims to introduce a single end-to-end Transformer model trained jointly on spectra from APOGEE DR17, GALAH DR3, DESI DR1, and Gaia RVS DR3 that produces homogeneous stellar atmospheric parameters (Teff, log g, [Fe/H]), abundances for 20 elements, distances, and ages on one self-consistent scale. The model handles arbitrary wavelength ranges and resolutions, reports precisions such as 18 K in Teff, 0.04 dex in log g, and 0.015 dex in [Fe/H] on APOGEE data (with correspondingly coarser precision on lower-resolution DESI spectra), demonstrates cross-survey consistency for overlapping stars within uncertainties, and validates against external distance catalogs and open-cluster properties. The resulting catalog is released publicly.

Significance. If the central claim of true homogeneity holds, the work would be significant for Galactic archaeology by enabling large-scale, bias-free combination of spectroscopic data without post-hoc recalibrations. The flexible Transformer architecture and public catalog release are strengths that support extensibility to future surveys such as SDSS-V and WEAVE.

major comments (2)
  1. [Training procedure (abstract and methods)] The training procedure (described in the abstract and methods) uses pipeline-derived labels as supervision. For the thousands of multi-survey overlap stars, each spectrum supplies a different target vector, producing contradictory gradients in the loss. The abstract reports only that post-training predictions agree within model uncertainties; this shows convergence to a compromise but does not demonstrate that the compromise is closer to physical truth than the original pipelines or free of new survey-dependent residuals. No consistency regularizer, label pre-alignment, or multi-task invariance term is described, which is load-bearing for the homogeneity claim; a sketch of such a regularizer follows these comments.
  2. [Results and validation sections] In the results and validation sections, the reported precisions (e.g., 18 K in Teff on APOGEE, 51 K on DESI) and cross-survey consistency are presented without explicit baseline comparisons to the individual survey pipelines or quantitative assessment of whether the unified model reduces systematic offsets relative to those pipelines. External validations against distance catalogs and open clusters are mentioned but lack details on error propagation and statistical tests for improvement, weakening support for the claim that the scale is demonstrably superior and homogeneous.
minor comments (2)
  1. [Abstract] The abstract would benefit from stating the total number of stars in the final catalog and the fraction of multi-survey overlaps to contextualize the scale of the homogeneity achievement.
  2. [Notation and figures] Notation for effective temperature (T_eff vs. Teff) and abundance brackets should be standardized throughout the text and figures for clarity.
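
To make major comment 1 concrete, here is a sketch of the kind of cross-survey consistency term the report finds missing; this is hypothetical, not something the paper describes.

```python
# Hypothetical consistency regularizer for multi-survey overlap stars:
# penalize disagreement between predictions for the same star as seen
# through two different surveys' spectra, pulling both views toward one
# scale instead of letting each chase its own pipeline labels.
import torch

def consistency_loss(pred_a, pred_b, weight=1.0):
    # pred_a, pred_b: (n_overlap, n_labels) predictions for the same stars.
    return weight * torch.mean((pred_a - pred_b) ** 2)
```
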

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their insightful comments, which have helped clarify the presentation of our results. We address each major comment point by point below, with revisions indicated where we will strengthen the manuscript.

read point-by-point responses
  1. Referee: The training procedure (described in the abstract and methods) uses pipeline-derived labels as supervision. For the thousands of multi-survey overlap stars, each spectrum supplies a different target vector, producing contradictory gradients in the loss. The abstract reports only that post-training predictions agree within model uncertainties; this shows convergence to a compromise but does not demonstrate that the compromise is closer to physical truth than the original pipelines or free of new survey-dependent residuals. No consistency regularizer, label pre-alignment, or multi-task invariance term is described, which is load-bearing for the homogeneity claim.

    Authors: The single end-to-end Transformer is trained jointly across all surveys, allowing shared parameters to learn a common representation that reconciles spectral differences and label variations. For overlap stars the model produces predictions consistent within uncertainties, indicating that the optimization has identified a scale supported by the combined data rather than survey-specific artifacts. External validations against independent distance catalogs and open-cluster properties further support that this scale is physically meaningful. We did not add an explicit regularizer because the joint multi-survey training already enforces cross-survey invariance through the shared architecture. We will revise the methods section to include a dedicated discussion of the training dynamics, the role of overlap stars, and the limitations of relying on pipeline labels without additional regularization. revision: partial

  2. Referee: In the results and validation sections, the reported precisions (e.g., 18 K in Teff on APOGEE, 51 K on DESI) and cross-survey consistency are presented without explicit baseline comparisons to the individual survey pipelines or quantitative assessment of whether the unified model reduces systematic offsets relative to those pipelines. External validations against distance catalogs and open clusters are mentioned but lack details on error propagation and statistical tests for improvement, weakening support for the claim that the scale is demonstrably superior and homogeneous.

    Authors: We agree that explicit baseline comparisons and expanded validation details will strengthen the manuscript. In the revised version we will add quantitative comparisons of the unified model against the original pipeline labels for the same stars, including mean offsets and scatter reductions in the overlap samples. We will also expand the external-validation sections to describe uncertainty propagation from the model outputs to distances and ages, and include statistical tests (e.g., reduced chi-squared and Kolmogorov-Smirnov statistics) demonstrating improved consistency with open-cluster literature values relative to the individual pipelines. revision: yes
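
For concreteness, a sketch (all names hypothetical) of the overlap-sample statistics this response promises: median offset, robust scatter, and reduced chi-squared per label, computed identically for the unified model and for the original pipelines.

```python
# Sketch of the promised baseline comparison on overlap stars. `delta` holds
# label differences (survey A minus survey B) and `sigma` the combined
# reported errors; run once on unified-model labels and once on original
# pipeline labels to quantify any improvement.
import numpy as np

def overlap_stats(delta, sigma):
    offset = np.median(delta)                             # systematic shift
    scatter = 1.4826 * np.median(np.abs(delta - offset))  # robust (MAD) scatter
    chi2_red = np.mean((delta / sigma) ** 2)              # ~1 if errors honest
    return offset, scatter, chi2_red
```
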

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper trains a single end-to-end Transformer on pipeline labels from APOGEE, GALAH, DESI, and Gaia RVS to produce a homogeneous catalog of stellar parameters. While supervision comes from the heterogeneous pipeline outputs, the central claims rest on the model's learned mapping from spectra to labels, with explicit cross-survey consistency checks and external validation against independent distance catalogs and open cluster metallicities/ages. No equation or step reduces the claimed self-consistent scale to the input labels by construction, nor does the derivation rely on self-citations, imported uniqueness theorems, or smuggled ansatzes. The framework is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The central claim depends on the neural network learning a survey-invariant mapping from heterogeneous spectra; this rests on many fitted model parameters and the domain assumption that spectra encode consistent physical information across instruments.

free parameters (2)
  • Transformer architecture hyperparameters
    Number of layers, attention heads, hidden dimensions, and embedding sizes chosen and optimized during training to fit the multi-survey data.
  • Training optimization parameters
    Learning rate, batch size, loss weighting, and regularization terms fitted to achieve the reported precisions across surveys.
axioms (1)
  • domain assumption Spectra from different surveys contain overlapping information about the same underlying stellar physical properties despite differences in resolution and wavelength coverage.
    Invoked to justify that one model can produce consistent labels without survey-specific branches.

pith-pipeline@v0.9.0 · 5657 in / 1479 out tokens · 53628 ms · 2026-05-07T15:40:14.254519+00:00 · methodology


Reference graph

Works this paper leans on

81 extracted references · 79 canonical work pages · 10 internal anchors

  1. Aggarwal, C. C., Hinneburg, A., & Keim, D. A. 2001, in Proceedings of the 8th International Conference on Database Theory
  2. Anders, F., Gispert, P., Ratcliffe, B., et al. 2023, A&A, 678, A158, doi: 10.1051/0004-6361/202346666
     Astropy Collaboration, Robitaille, T. P., Tollerud, E. J., et al. 2013, A&A, 558, A33, doi: 10.1051/0004-6361/201322068
     Astropy Collaboration, Price-Whelan, A. M., Sipőcz, B. M., et al. 2018, AJ, 156, 123, doi: 10.3847/1538-3881/aabc4f
     Astropy Collaborat…
  3. Bensby, T., Feltzing, S., & Oey, M. S. 2014, A&A, 562, A71, doi: 10.1051/0004-6361/201322631
  4. Berni, L., Spina, L., Magrini, L., et al. 2025, A&A, 700, A160, doi: 10.1051/0004-6361/202555272
  5. Bossini, D., Vallenari, A., Bragaglia, A., et al. 2019, A&A, 623, A108, doi: 10.1051/0004-6361/201834693
  6. Bovy, J. 2016, ApJ, 817, 49, doi: 10.3847/0004-637x/817/1/49
  7. Buder, S., Sharma, S., Kos, J., et al. 2021, MNRAS, 506, 150, doi: 10.1093/mnras/stab1242
  8. Cantat-Gaudin, T., Jordi, C., Vallenari, A., et al. 2018, A&A, 618, A93, doi: 10.1051/0004-6361/201833476
  9. Carrera, R., Bragaglia, A., Cantat-Gaudin, T., et al. 2019, A&A, 623, A80, doi: 10.1051/0004-6361/201834546
  10. DESI Collaboration, Karim, M. A., Adame, A. G., et al. 2026, Data Release 1 of the Dark Energy Spectroscopic Instrument, https://arxiv.org/abs/2503.14745
  11. Cooper, A. P., Koposov, S. E., Allende Prieto, C., et al. 2023, ApJ, 947, 37, doi: 10.3847/1538-4357/acb3c0
  12. Das, P. B., Zucker, D. B., De Silva, G. M., et al. 2025, MNRAS, 538, 605, doi: 10.1093/mnras/staf169
  13. Dawid, A. P. 1984, Journal of the Royal Statistical Society, Series A (General), 147, 278, doi: 10.2307/2981683
  14. Dawson, K. S., Schlegel, D. J., Ahn, C. P., et al. 2013, AJ, 145, 10, doi: 10.1088/0004-6256/145/1/10
  15. Dawson, K. S., Kneib, J.-P., Percival, W. J., et al. 2016, AJ, 151, 44, doi: 10.3847/0004-6256/151/2/44
  16. Dias, W. S., Monteiro, H., Moitinho, A., et al. 2021, MNRAS, 504, 356, doi: 10.1093/mnras/stab770
  17. Dosovitskiy, A., Beyer, L., Kolesnikov, A., et al. 2021, An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, https://arxiv.org/abs/2010.11929
  18. Fabbro, S., Venn, K. A., O'Briain, T., et al. 2018, MNRAS, 475, 2978, doi: 10.1093/mnras/stx3298
  19. Frankel, N., Rix, H.-W., Ting, Y.-S., Ness, M., & Hogg, D. W. 2018, ApJ, 865, 96, doi: 10.3847/1538-4357/aadba5
  20. Frankel, N., Sanders, J., Rix, H.-W., Ting, Y.-S., & Ness, M. 2019, ApJ, 884, 99, doi: 10.3847/1538-4357/ab4254
  21. Frankel, N., Sanders, J., Ting, Y.-S., & Rix, H.-W. 2020, ApJ, 896, 15, doi: 10.3847/1538-4357/ab910c
  22. Freeman, K., & Bland-Hawthorn, J. 2002, ARA&A, 40, 487, doi: 10.1146/annurev.astro.40.060401.093840
      Gaia Collaboration, Vallenari, A., Brown, A. G. A., et al. 2023, A&A, 674, A1, doi: 10.1051/0004-6361/202243940
  23. Gal, Y., & Ghahramani, Z. 2016, Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, https://arxiv.org/abs/1506.02142
  24. Gneiting, T., Balabdaoui, F., & Raftery, A. E. 2007, Journal of the Royal Statistical Society Series B: Statistical Methodology, 69, 243, doi: 10.1111/j.1467-9868.2007.00587.x
  25. Gou, J., Yu, B., Maybank, S. J., & Tao, D. 2021, International Journal of Computer Vision, 129, 1789, doi: 10.1007/s11263-021-01453-z
  26. Guiglion, G., Nepal, S., Chiappini, C., et al. 2024, A&A, 682, A9, doi: 10.1051/0004-6361/202347122
  27. Harris, C. R., Millman, K. J., van der Walt, S. J., et al. 2020, Nature, 585, 357, doi: 10.1038/s41586-020-2649-2
  28. Hayden, M. R., Bovy, J., Holtzman, J. A., et al. 2015, ApJ, 808, 132, doi: 10.1088/0004-637X/808/2/132
  29. Heiter, U., Soubiran, C., Netopil, M., & Paunzen, E. 2014, A&A, 561, A93, doi: 10.1051/0004-6361/201322559
  30. Helmi, A., Babusiaux, C., Koppelman, H. H., et al. 2018, Nature, 563, 85, doi: 10.1038/s41586-018-0625-x
  31. Henry, A., Dachapally, P. R., Pawar, S., & Chen, Y. 2020, Query-Key Normalization for Transformers, https://arxiv.org/abs/2010.04245
  32. Hinton, G., Vinyals, O., & Dean, J. 2015, Distilling the Knowledge in a Neural Network, https://arxiv.org/abs/1503.02531
  33. Ho, A. Y. Q., Rix, H.-W., Ness, M. K., et al. 2017a, ApJ, 841, 40, doi: 10.3847/1538-4357/aa6db3
  34. Ho, A. Y. Q., Ness, M. K., Hogg, D. W., et al. 2017b, ApJ, 836, 5, doi: 10.3847/1538-4357/836/1/5
  35. Hogg, D. W., Eilers, A.-C., & Rix, H.-W. 2019, AJ, 158, 147, doi: 10.3847/1538-3881/ab398c
  36. Hunter, J. D. 2007, Computing in Science & Engineering, 9, 90, doi: 10.1109/MCSE.2007.55
      Jofré, P., Heiter, U., & Soubiran, C. 2019, ARA&A, 57, 571, doi: 10.1146/annurev-astro-091918-104509
  37. Lee, K., Lee, K., Lee, H., & Shin, J. 2018, A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks, https://arxiv.org/abs/1807.03888
  38. Leung, H. W., & Bovy, J. 2018, MNRAS, doi: 10.1093/mnras/sty3217
  39. Leung, H. W., & Bovy, J. 2023, MNRAS, 527, 1494, doi: 10.1093/mnras/stad3015
  40. Leung, H. W., Bovy, J., Mackereth, J. T., & Miglio, A. 2023, MNRAS, 522, 4577, doi: 10.1093/mnras/stad1272
  41. Li, S., Wang, W., Koposov, S. E., et al. 2025, AJ, 170, 171, doi: 10.3847/1538-3881/adf1a0
  42. Lian, J., Zasowski, G., Hasselquist, S., et al. 2022, MNRAS, 511, 5639, doi: 10.1093/mnras/stac479
  43. Liang, Y., Melchior, P., Hahn, C., et al. 2023, Outlier Detection in the DESI Bright Galaxy Survey, https://arxiv.org/abs/2307.07664
  44. Liu, F. T., Ting, K. M., & Zhou, Z.-H. 2008, in 2008 Eighth IEEE International Conference on Data Mining, 413, doi: 10.1109/ICDM.2008.17
  45. Mackereth, J. T., Bovy, J., Leung, H. W., et al. 2019, MNRAS, 489, 176, doi: 10.1093/mnras/stz1521
  46. Majewski, S. R., Schiavon, R. P., Frinchaboy, P. M., et al. 2017, AJ, 154, 94, doi: 10.3847/1538-3881/aa784d
  47. Myeong, G. C., Evans, N. W., Belokurov, V., Sanders, J. L., & Koposov, S. E. 2018, The Shards of ω Centauri, https://arxiv.org/abs/1804.07050
  48. Nandakumar, G., Hayden, M. R., Sharma, S., et al. 2022, MNRAS, 513, 232, doi: 10.1093/mnras/stac873
  49. Ness, M., Hogg, D. W., Rix, H.-W., Ho, A. Y. Q., & Zasowski, G. 2015, ApJ, 808, 16, doi: 10.1088/0004-637X/808/1/16
      O'Briain, T., Ting, Y.-S., Fabbro, S., et al. 2021, ApJ, 906, 130, doi: 10.3847/1538-4357/abca96
  50. Pace, W., She, C., Xu, L., et al. 2025, Lance: Efficient Random Access in Columnar Storage through Adaptive Structural Encodings, https://arxiv.org/abs/2504.15247
  51. Parker, L., Lanusse, F., Shen, J., et al. 2025, arXiv e-prints, arXiv:2510.17960, doi: 10.48550/arXiv.2510.17960
  52. Paszke, A., Gross, S., Massa, F., et al. 2019, PyTorch: An Imperative Style, High-Performance Deep Learning Library, https://arxiv.org/abs/1912.01703
  53. Pedregosa, F., Varoquaux, G., Gramfort, A., et al. 2011, Journal of Machine Learning Research, 12, 2825
  54. Pinsonneault, M. H., Elsworth, Y. P., Tayar, J., et al. 2018, ApJS, 239, 32, doi: 10.3847/1538-4365/aaebfd
  55. Piskunov, N., & Valenti, J. A. 2017, A&A, 597, A16, doi: 10.1051/0004-6361/201629124
  56. Queiroz, A. B. A., Anders, F., Chiappini, C., et al. 2023, A&A, 673, A155, doi: 10.1051/0004-6361/202245399
  57. Raasveldt, M., & Mühleisen, H. 2019, in Proceedings of the 2019 International Conference on Management of Data, SIGMOD '19 (New York, NY, USA: Association for Computing Machinery), 1981, doi: 10.1145/3299869.3320212
  58. Recio-Blanco, A., de Laverny, P., Palicio, P. A., et al. 2023, A&A, 674, A29, doi: 10.1051/0004-6361/202243750
  59. Sandford, N. R., Weisz, D. R., & Ting, Y.-S. 2020, ApJS, 249, 24, doi: 10.3847/1538-4365/ab9cb0
  60. Sanh, V., Debut, L., Chaumond, J., & Wolf, T. 2020, DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter, https://arxiv.org/abs/1910.01108
  61. Shazeer, N. 2020, GLU Variants Improve Transformer, https://arxiv.org/abs/2002.05202
  62. Shen, J. 2026, homogeneous_stellar_parameters, Zenodo, doi: 10.5281/ZENODO.19830516
  63. Shen, J., & Melchior, P. 2023, Multiscale Feature Attribution for Outliers, https://arxiv.org/abs/2310.20012
  64. Shen, J., Speagle, J. S., Mackereth, J. T., Ting, Y.-S., & Bovy, J. 2024, ApJ, 960, 84, doi: 10.3847/1538-4357/ad0559
  65. Shen, J., Lanusse, F., Parker, L., et al. 2025, arXiv e-prints, arXiv:2510.17959, doi: 10.48550/arXiv.2510.17959
  66. Simonyan, K., Vedaldi, A., & Zisserman, A. 2014, Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps, https://arxiv.org/abs/1312.6034
  67. Smilkov, D., Thorat, N., Kim, B., Viégas, F., & Wattenberg, M. 2017, SmoothGrad: removing noise by adding noise, https://arxiv.org/abs/1706.03825
  68. Spina, L., Ting, Y.-S., De Silva, G. M., et al. 2021, MNRAS, 503, 3279, doi: 10.1093/mnras/stab471
  69. Su, J., Lu, Y., Pan, S., et al. 2023, RoFormer: Enhanced Transformer with Rotary Position Embedding, https://arxiv.org/abs/2104.09864
  70. Sundararajan, M., Taly, A., & Yan, Q. 2017, Axiomatic Attribution for Deep Networks, https://arxiv.org/abs/1703.01365
  71. Tang, J., Shivanna, R., Zhao, Z., et al. 2021, Understanding and Improving Knowledge Distillation, https://arxiv.org/abs/2002.03532
  72. Tarricq, Y., Soubiran, C., Casamiquela, L., et al. 2021, A&A, 647, A19, doi: 10.1051/0004-6361/202039388
  73. Gemma Team, Riviere, M., Pathak, S., et al. 2024, Gemma 2: Improving Open Language Models at a Practical Size, https://arxiv.org/abs/2408.00118
  74. Thyng, K. M., Greene, C. A., Hetland, R. D., Zimmerle, H. M., & DiMarco, S. F. 2016, Oceanography, 29, doi: 10.5670/oceanog.2016.66
  75. Ting, Y.-S., Conroy, C., Rix, H.-W., & Cargile, P. 2017, ApJ, 843, 32, doi: 10.3847/1538-4357/aa7688
  76. Tsantaki, M., Pancino, E., Marrese, P., et al. 2022, A&A, 659, A95, doi: 10.1051/0004-6361/202141702
  77. Turchi, A., Pancino, E., Avdeeva, A., et al. 2025, A&A, 700, A195, doi: 10.1051/0004-6361/202555695
  78. Vaswani, A., Shazeer, N., Parmar, N., et al. 2017, Attention Is All You Need, https://arxiv.org/abs/1706.03762
  79. Wheeler, A., Ness, M., Buder, S., et al. 2020, ApJ, 898, 58, doi: 10.3847/1538-4357/ab9a46
  80. Yanny, B., Rockosi, C., Newberg, H. J., et al. 2009, AJ, 137, 4377, doi: 10.1088/0004-6256/137/5/4377
Showing first 80 references.