pith. machine review for the scientific record.

arxiv: 2605.06636 · v1 · submitted 2026-05-07 · 🌌 astro-ph.IM · gr-qc

Recognition: unknown

What You Don't Know Won't Hurt You: Self-Consistent Hierarchical Inference with Unknown Follow-up Selection Strategies


Pith reviewed 2026-05-08 04:55 UTC · model grok-4.3

keywords: hierarchical Bayesian inference · population inference · selection effects · follow-up observations · astronomical surveys · gravitational waves · transient sources

The pith

Hierarchical Bayesian inference recovers intrinsic populations without modeling unknown follow-up selection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Astronomical surveys often generate many candidate sources whose follow-up is decided by criteria that are hard to model or vary across observers. The paper shows that hierarchical Bayesian inference of the underlying population does not require explicit modeling of how follow-up candidates are chosen. Accurate recovery of the intrinsic population remains possible even when those choices correlate with the parameters being estimated. The hierarchical structure accounts for selection at the detection stage, bypassing the need to describe later decisions. This simplification applies as long as the initial detection process, including any contaminants, is properly modeled.

Core claim

We show that explicitly modeling the follow-up selection process is not required for self-consistent inference of the intrinsic population. Using the framework of hierarchical Bayesian inference, the intrinsic population can be accurately inferred even when the decision to follow up candidates strongly correlates with latent parameters of interest. We provide several worked examples, showing that the precision of posterior constraints can depend on the follow-up process and that one may have to model a population of contaminants if the initial selection is imperfect.

What carries the argument

Hierarchical Bayesian inference that integrates over individual source parameters while conditioning only on the modeled initial detection selection and marginalizing over unknown follow-up decisions.
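
One way to see why the follow-up rule drops out is to write the likelihood for the observation model of Figure 2, where detection ($D$) and the follow-up decision ($F_i \in \{0, 1\}$) depend only on the catalog data $x_i$. The sketch below additionally assumes that $x_i$ and $f_i$ are conditionally independent given the latent parameters $\theta_i$, and uses $\Lambda$ for the population parameters. These symbols are placeholders chosen for illustration and may not match the paper's notation.

```latex
\begin{align}
p(\{x_i, F_i, f_i\} \mid \Lambda, \{D_i\})
  &= \prod_{i=1}^{N_D}
     \frac{P(F_i \mid x_i)}{P(D \mid \Lambda)}
     \int d\theta_i \, p(x_i \mid \theta_i)
     \left[ p(f_i \mid \theta_i) \right]^{F_i}
     p(\theta_i \mid \Lambda), \\
P(D \mid \Lambda)
  &= \int dx \, P(D \mid x) \int d\theta \, p(x \mid \theta) \, p(\theta \mid \Lambda).
\end{align}
```

Because $P(F_i \mid x_i)$ depends on neither $\theta_i$ nor $\Lambda$, it multiplies the likelihood by a constant and cancels from the posterior on $\Lambda$, whatever form it takes. The rule still determines which events carry the factor $p(f_i \mid \theta_i)$, which is consistent with the statement that the precision of the constraints can depend on the follow-up process.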

If this is right

  • The precision of posterior constraints on the population depends on the details of the follow-up process.
  • Contaminants in the initial selection must be modeled explicitly to avoid bias in the inferred population.
  • Population inference can use follow-up data from multiple independent observers without knowledge of their selection strategies.
  • The approach applies directly to high-volume surveys such as LSST, Gaia, and next-generation gravitational-wave detectors.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same hierarchical structure may simplify inference in other fields that face layered and partially unknown selection effects.
  • Mock-data tests with deliberately correlated follow-up rules could be used to verify recovery of known populations.
  • Real-time population updates become more practical during ongoing surveys since not every follow-up decision needs to be tracked.

Load-bearing premise

The initial detection or candidate selection process must be either perfect or its imperfections such as contaminants must be explicitly modeled.
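
If the candidate list does contain contaminants, one generic way to account for them is a two-component mixture. The form below is an illustrative assumption rather than the specific model used in the paper: the mixing fraction $\xi$, the source parameters $\Lambda_s$, and the contaminant parameters $\Lambda_c$ are hypothetical symbols, and both components are assumed to pass through the same modeled detection step.

```latex
\begin{align}
p(\theta \mid \Lambda_s, \Lambda_c, \xi)
  &= (1 - \xi) \, p_s(\theta \mid \Lambda_s) + \xi \, p_c(\theta \mid \Lambda_c), \\
P(D \mid \Lambda_s, \Lambda_c, \xi)
  &= (1 - \xi) \, P(D \mid \Lambda_s) + \xi \, P(D \mid \Lambda_c).
\end{align}
```

In this parameterization $\xi$ is the contaminant fraction before selection, so the fraction among detected candidates follows from the two detection probabilities rather than being a separate free parameter.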

What would settle it

A simulation with a known true population and a follow-up rule that depends on the latent parameters of interest; the method should recover the true population parameters within posterior uncertainties when the follow-up rule is left unmodeled.
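
A minimal version of such a test can be sketched in Python. Everything in it is a hypothetical toy chosen for illustration and not taken from the paper: a Gaussian population with unknown mean, Gaussian noise on the catalog data `x` and follow-up data `f`, a hard detection threshold `X_DET`, and a follow-up rule `x > X_FOLLOW` that the inference never sees. The posterior mode is found on a grid with a flat prior, and a naive analysis that keeps only the followed-up events is included for comparison.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

# Illustrative values only (assumptions, not taken from the paper).
MU_TRUE, S_POP = 0.0, 1.0      # population: theta ~ N(MU_TRUE, S_POP^2)
S_X, S_F = 1.0, 0.3            # noise on catalog data x and follow-up data f
X_DET, X_FOLLOW = 0.0, 1.5     # detection threshold; hidden follow-up threshold

# Simulate sources, detection on x, and follow-up decided from x alone.
theta = rng.normal(MU_TRUE, S_POP, 20000)
x = theta + rng.normal(0.0, S_X, theta.size)
detected = x > X_DET
theta, x = theta[detected], x[detected]
followed = x > X_FOLLOW        # correlates strongly with theta; never modeled
f = theta[followed] + rng.normal(0.0, S_F, int(followed.sum()))


def log_norm(y, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * ((y - mean) ** 2 / var + np.log(2.0 * np.pi * var))


def log_like(mu, x_all, x_fu, f_fu):
    """Hierarchical log-likelihood for mu; the follow-up rule does not appear."""
    var_x = S_POP**2 + S_X**2
    # Detection probability P(x > X_DET | mu), the only modeled selection.
    p_det = 0.5 * math.erfc((X_DET - mu) / math.sqrt(2.0 * var_x))
    ll = np.sum(log_norm(x_all, mu, var_x)) - x_all.size * math.log(p_det)
    # Followed-up events add p(f | x, mu), with theta marginalized analytically.
    w = S_POP**2 / var_x
    var_f = S_POP**2 + S_F**2 - S_POP**4 / var_x
    return ll + np.sum(log_norm(f_fu, mu + w * (x_fu - mu), var_f))


grid = np.arange(-1.0, 2.5, 0.005)
# All detected events, with follow-up data where it exists.
mu_full = grid[np.argmax([log_like(m, x, x[followed], f) for m in grid])]
# Naive comparison: keep only the followed-up events.
mu_naive = grid[np.argmax([log_like(m, x[followed], x[followed], f) for m in grid])]

print(f"true mu = {MU_TRUE}, all detected events: {mu_full:.3f}, "
      f"followed-up events only: {mu_naive:.3f}")
```

Under these assumptions the first estimate should land near the injected value, while the second is pulled high: discarding the events without follow-up makes the unmodeled follow-up rule part of the effective selection.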

Figures

Figures reproduced from arXiv: 2605.06636 by Amanda M. Farah, Reed Essick.

Figure 1. (left) Distributions of true event parameters and data for an example mock catalog with 500 detected events of which 56 received follow-up: (blue) distributions of detected sources and (orange) distributions of sources that received follow-up. We show distributions over the true event parameters (θ), original catalog data (x), and follow-up data (f). See Sec. 3.1 and Appendix B for more details. This catal…
Figure 2. Directed acyclic graph (DAG) representing the type of observations we consider. There are $N_D$ detected events in total, but only $N_F \le N_D$ receive follow-up. Importantly, both detection (D) and the decision to follow up (F) or not ($\cancel{F}$) for each event depend only on the initial catalog data (x). Filled nodes correspond to observed variables and unfilled nodes denote latent variables. Plates (rectangular boxes)…
Figure 3. (left) Estimated detector-frame masses ($\hat{M}_{\mathrm{det}}$) and distances ($\hat{D}$) for individual events within a mock catalog of 50 detected events. Events that are followed up by different schemes are circled in the corresponding color. The grey shaded region denotes non-detections ($\hat{\rho} \le \hat{\rho}_{\mathrm{thr}}$). (right) Posteriors for the Hubble parameter ($H_0$, scaled by the injected value $H_{\mathrm{inj}}$) using the same catalog but assuming differ…
Figure 4. (bottom left) Distributions of population parameters (see Appendix G) for an example catalog with 1396 candidates of which only the 100 with the largest $\hat{\alpha}^2 + \hat{\beta}^2$ were followed up. Like…
Figure 5. (left, colored lines) Differences between Bayesian and frequentist quantiles (Δ quantile) vs. Bayesian quantile (corresponding to the highest-probability-density region that marginally contains the true parameters) for posteriors inferred using different follow-up strategies and (grey shading) expected uncertainty from the finite number of trials used. (right) Cumulative fractions of trials as a function o…
Figure 6. A simple model of active learning in which future follow-up decisions are based on all previous data through, presumably, an inference of the population constructed with all previously recorded data. Nodes are colored according to which events they are associated with (events 1, 2, and 3 are colored blue, red, and grey, respectively).
Figure 7. DAG representing a related observational scheme in which events are observed in two instruments with data x and f, respectively. We select events based on x (D) but do not condition on x itself. Eq. E24 assumes that we do not select based on f (F) whereas Eq. E26 does.
D. Discarding follow-up data: For the sake of completeness, we now show that one cannot discard follow-up data (f) based on the same follow-…
Figure 8. Coverage tests for the Gaia mixture model (Sec. 3.3 and Appendix G), analogous to…
Original abstract

Many astronomical surveys prompt follow-up observations, but the decision process through which candidates are selected for follow-up can be difficult to model. This poses a challenge when inferring properties of the intrinsic population of astrophysical sources, rather than those of the set of objects detected by the survey and often-incomplete follow-up observations. We alleviate this problem by demonstrating that explicitly modeling the follow-up selection process is not required for self-consistent inference of the intrinsic population. Using the framework of hierarchical Bayesian inference, we show that the intrinsic population can be accurately inferred even when the decision to follow up candidates strongly correlates with latent parameters of interest. We provide several worked examples, showing that the precision of posterior constraints can depend on the follow-up process and that one may have to model a population of contaminants if the initial selection is imperfect. Our result could dramatically simplify population inference that incorporates uncoordinated follow-up from multiple observers triggered by the deluge of candidates from surveys like LSST, Gaia, and next-generation gravitational-wave interferometers.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript claims that hierarchical Bayesian inference allows accurate recovery of the intrinsic astrophysical population parameters even when the follow-up selection strategy is unknown and may correlate arbitrarily with latent parameters of interest. This is supported by several worked examples; the text notes that a population of contaminants must be modeled if the initial detection/selection step is imperfect, but argues that the follow-up step itself need not be modeled explicitly.

Significance. If the central result holds under the stated conditions, the work would substantially simplify population inference for surveys that generate large numbers of candidates with uncoordinated or poorly documented follow-up (LSST, Gaia, next-generation GW detectors). The provision of concrete worked examples is a strength that aids reproducibility and practical adoption.

major comments (1)
  1. [Abstract] Abstract and the discussion of the initial selection process: the central claim that follow-up selection need not be modeled is presented as general, yet the text explicitly states that contaminants must be modeled when initial selection is imperfect. No derivation or worked example is referenced that demonstrates unbiased recovery when both the initial filter and the follow-up strategy are unmodeled and their product becomes the effective selection function. This assumption is load-bearing for the generality of the result.
minor comments (1)
  1. [Methods] The notation distinguishing the initial detection probability, the follow-up probability, and the effective selection function should be introduced once and used consistently in the methods and examples sections to avoid reader confusion.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading and constructive comments. The major comment raises a valid point about the scope and presentation of our central claim, and we have revised the manuscript to clarify the assumptions regarding initial selection and to better distinguish the follow-up component from the effective selection function.

point-by-point responses
  1. Referee: [Abstract] Abstract and the discussion of the initial selection process: the central claim that follow-up selection need not be modeled is presented as general, yet the text explicitly states that contaminants must be modeled when initial selection is imperfect. No derivation or worked example is referenced that demonstrates unbiased recovery when both the initial filter and the follow-up strategy are unmodeled and their product becomes the effective selection function. This assumption is load-bearing for the generality of the result.

    Authors: We agree that the abstract and introductory discussion would benefit from greater precision on this point. Our result is specifically that the follow-up selection strategy need not be modeled explicitly, because the hierarchical Bayesian framework allows marginalization over unknown follow-up decisions (even when they correlate arbitrarily with latent parameters) without biasing the inferred population parameters. This holds provided the initial detection/selection step is either perfect or that any resulting contaminants are modeled as an additional population component, as already stated in the manuscript. We do not claim that the product of two unknown selection functions can be ignored; on the contrary, the text explicitly requires contaminant modeling precisely when the initial step is imperfect. The worked examples demonstrate recovery under unknown follow-up while properly accounting for the initial selection. We have revised the abstract to reference the relevant sections on initial selection and contaminants, and added a clarifying sentence in the discussion to emphasize that the effective selection function must still be handled via the initial-step modeling. No new derivation is required, as the marginalization over follow-up is a direct consequence of the hierarchical model structure. revision: yes

Circularity Check

0 steps flagged

No circularity; result is a general property of hierarchical Bayesian marginalization

full rationale

The paper demonstrates via hierarchical Bayesian inference that population parameters remain recoverable without explicit follow-up selection modeling, provided the initial detection is either perfect or contaminants are included in the model. This is shown through worked examples rather than a closed derivation that reduces to its own inputs. The abstract states the caveat directly, and no self-citations, fitted predictions renamed as results, or self-definitional steps are present in the provided text. The central claim is a robustness property of the inference framework itself and does not loop back to assumptions that presuppose the conclusion.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the standard hierarchical Bayesian inference framework for population studies; no new free parameters or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption: The hierarchical Bayesian inference framework applies to inferring intrinsic populations from selected observations.
    This is the standard setup invoked in the abstract for population inference.

pith-pipeline@v0.9.0 · 5478 in / 1124 out tokens · 71148 ms · 2026-05-08T04:55:53.512357+00:00 · methodology

