pith. machine review for the scientific record.

arxiv: 2605.02969 · v1 · submitted 2026-05-03 · 🌌 astro-ph.IM · stat.AP

Recognition: unknown

The Catastrophic Consequences of Agnosticism for Life Searches and a Possible Workaround

Pith reviewed 2026-05-09 16:22 UTC · model grok-4.3

classification 🌌 astro-ph.IM stat.AP
keywords life detection · biosignatures · Bayesian inference · false positives · survey design · agnostic priors · confounders

The pith

Dividing targets into two groups with different life prevalences but identical confounder rates gives surveys of only a few dozen targets a realistic chance of a strong life detection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that fully agnostic priors on both the occurrence rate of life and the occurrence rate of confounding signals make it nearly impossible to accumulate strong Bayesian evidence for life in realistic survey sizes. Even with hundreds of targets the Bayes factor stays short of strong evidence, because any excess of positive labels can be explained about as well by a higher confounder rate. The author therefore proposes partitioning the observed targets into two groups chosen so that one is expected to host life more often than the other, while the confounder rate is assumed to be the same across both. Under this split the difference in positive detections between the groups can drive the Bayes factor upward, so that a sizeable fraction of possible outcomes gives strong evidence even when the total sample is small. A reader would care because the result reframes survey design itself as the tool that can overcome the epistemic barrier without having to guess the unknown rates in advance.

Core claim

With uninformative priors on life prevalence and confounder prevalence, the Bayes factor comparing the life hypothesis to the null requires at least roughly 10^4 surveyed targets (and for some priors up to 10^13) to reach strong evidence. Partitioning the sample into two groups that differ in life prevalence while sharing a single global confounder rate changes the scaling: for a total of 24 targets the fraction of possible outcome patterns that produce strong evidence reaches 24 percent and exceeds 50 percent once the total reaches 76 targets.
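
To make the single-population scaling concrete, here is a minimal sketch, not the paper's code. It assumes uniform priors on the life rate f and the confounder rate C, and it assumes life and confounders occur independently, so that a target is labeled positive with probability 1 - (1 - f)(1 - C) (the three-of-four-quadrants rule described in the Figure 1 caption). The function name and the Gauss-Legendre quadrature are illustrative choices.

```python
import numpy as np


def monolithic_bayes_factor(n_pos, n_tot, n_nodes=64):
    """Bayes factor (life over null) for n_pos positive labels among n_tot targets.

    Assumed model (not taken from the paper's code): uniform priors on f and C,
    and P(positive) = 1 - (1 - f)(1 - C). The quadrature is exact while
    n_tot <= 2 * n_nodes - 1, because the integrands are polynomials.
    """
    # Gauss-Legendre nodes and weights mapped from [-1, 1] to [0, 1].
    x, w = np.polynomial.legendre.leggauss(n_nodes)
    t, w = 0.5 * (x + 1.0), 0.5 * w

    # Null hypothesis: f = 0, so every positive label is a confounder.
    z_null = np.sum(w * t**n_pos * (1.0 - t) ** (n_tot - n_pos))

    # Life hypothesis: integrate the likelihood over both f and C.
    p = 1.0 - np.outer(1.0 - t, 1.0 - t)  # rows index f, columns index C
    like = p**n_pos * (1.0 - p) ** (n_tot - n_pos)
    z_life = w @ like @ w

    return z_life / z_null


if __name__ == "__main__":
    n_tot = 25  # HWO-like survey size quoted in the abstract
    best = max(monolithic_bayes_factor(k, n_tot) for k in range(n_tot + 1))
    print(f"largest Bayes factor for life at N_tot={n_tot}: {best:.2f}")
```

Under these assumptions the all-positive outcome has a closed form: the Bayes factor equals the harmonic number H_{N_tot+1}, about 3.85 for N_tot = 25. Harmonic numbers grow only logarithmically and pass 10 at N_tot of order 10^4, which is at least consistent with the survey size quoted above if "strong" means a Bayes factor above 10.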

What carries the argument

The two-group partitioning strategy (an AB-test analogue) in which targets are assigned to groups expected to differ in life prevalence while the confounder rate is required to be identical and global, allowing the differential positive rate to update the odds ratio.
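
Written out, the two-group comparison might look as follows. This is a sketch under stated assumptions rather than the paper's own equations: f_A, f_B and C follow the symbols in the figure captions, while n_g (targets in group g), k_g (positive labels in group g) and the generic prior densities π are notation introduced here. Life and confounders are assumed independent, so a target is labeled positive unless both are absent.

```latex
\begin{align}
p_g &= 1 - (1 - f_g)(1 - C), \qquad g \in \{A, B\}, \\
\mathcal{L}(k_A, k_B \mid f_A, f_B, C)
  &= \prod_{g \in \{A, B\}} \binom{n_g}{k_g}\, p_g^{k_g}\, (1 - p_g)^{n_g - k_g}, \\
K &= \frac{\displaystyle \int_0^1\!\!\int_0^1\!\!\int_0^1
        \mathcal{L}(k_A, k_B \mid f_A, f_B, C)\,
        \pi(f_A)\,\pi(f_B)\,\pi(C)\;
        \mathrm{d}f_A\,\mathrm{d}f_B\,\mathrm{d}C}
       {\displaystyle \int_0^1 \mathcal{L}(k_A, k_B \mid 0, 0, C)\,\pi(C)\;\mathrm{d}C}.
\end{align}
```

Here K is written as life over null; the figure captions quote the null-versus-life factor, which is the reciprocal. Because the same C appears in both p_A and p_B, a lopsided split of positives cannot be absorbed by the confounder rate alone, which is what lets the differential positive rate move the odds.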

If this is right

  • Survey planners must deliberately select two target sets whose expected life prevalences differ substantially.
  • With a total of 24 targets only 24 percent of possible outcomes yield strong evidence for life, so larger samples or multiple independent surveys are needed for a high overall chance of success.
  • The approach preserves complete agnosticism on the absolute value of the confounder rate and avoids the sensitivity problems that arise when an arbitrary upper limit is imposed instead.
  • The same partitioning logic applies to any biosignature or technosignature search that faces unknown false-positive rates.
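
The 24-target figure in the list above comes from counting possible outcomes rather than from a single survey. A rough sketch of such a count follows, under assumptions that are not confirmed by the paper: uniform priors on f_A, f_B and C, an even 12/12 split, independence of life and confounders, and "strong" meaning a Bayes factor above 10. The printed count depends on all of these choices and need not reproduce the 40 of 169 quoted in the Figure 5 caption. All names are hypothetical.

```python
import numpy as np

# Gauss-Legendre rule on [0, 1]; exact here because every integrand is a
# polynomial of degree at most 24 in each variable.
_x, _w = np.polynomial.legendre.leggauss(64)
T, W = 0.5 * (_x + 1.0), 0.5 * _w
# P(positive | f, C) = 1 - (1 - f)(1 - C); rows index f, columns index C.
P = 1.0 - np.outer(1.0 - T, 1.0 - T)


def group_marginal(k, n):
    """Integrate p^k (1-p)^(n-k) over a uniform prior on f; one value per C node."""
    return W @ (P**k * (1.0 - P) ** (n - k))


def two_group_bayes_factor(a, b, n_a, n_b):
    """Bayes factor (life over null) for a positives in group A and b in group B."""
    # Null: f_A = f_B = 0, so both groups share the positive probability C.
    z_null = np.sum(W * T ** (a + b) * (1.0 - T) ** (n_a + n_b - a - b))
    # Life: f_A and f_B are free, but the same C enters both groups.
    z_life = np.sum(W * group_marginal(a, n_a) * group_marginal(b, n_b))
    return z_life / z_null


n_a = n_b = 12
strong = [
    (a, b)
    for a in range(n_a + 1)
    for b in range(n_b + 1)
    if two_group_bayes_factor(a, b, n_a, n_b) > 10.0
]
total = (n_a + 1) * (n_b + 1)
print(f"{len(strong)} of {total} outcomes give strong evidence for life")
```

In this sketch the strong outcomes are the lopsided ones, such as 12 positives in one group and none in the other, while an all-positive survey stays inconclusive. Counting outcomes with equal weight is not the same as the probability of success in a real survey, which would depend on the true rates.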

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If real astrophysical confounders turn out to vary systematically between the two groups, the method's advantage disappears and the original catastrophic scaling returns.
  • The strategy suggests that target lists should be chosen to maximize contrast in habitability metrics rather than to maximize average habitability.
  • The same split-sample idea could be tested on existing catalog data by retrospectively assigning groups according to stellar or planetary properties.
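
The first point can be probed numerically. The sketch below rests on illustrative assumptions (uniform priors, a 12/12 split, a threshold of 10, independence of life and confounders). It asks how often a universe with no life at all would be scored as strong evidence for life when the two groups have different true confounder rates c_a and c_b but the analysis still assumes one shared rate. The function names and example rates are hypothetical.

```python
from math import comb

import numpy as np

_x, _w = np.polynomial.legendre.leggauss(64)
T, W = 0.5 * (_x + 1.0), 0.5 * _w
P = 1.0 - np.outer(1.0 - T, 1.0 - T)  # P(positive | f, C), rows f, columns C


def two_group_bayes_factor(a, b, n_a, n_b):
    """Life-over-null Bayes factor assuming one shared confounder rate."""
    g_a = W @ (P**a * (1.0 - P) ** (n_a - a))
    g_b = W @ (P**b * (1.0 - P) ** (n_b - b))
    z_life = np.sum(W * g_a * g_b)
    z_null = np.sum(W * T ** (a + b) * (1.0 - T) ** (n_a + n_b - a - b))
    return z_life / z_null


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def false_alarm_probability(c_a, c_b, n_a=12, n_b=12, threshold=10.0):
    """P(Bayes factor > threshold) when no target hosts life.

    Positives in each group are then pure confounders, drawn with the true
    (possibly unequal) rates c_a and c_b.
    """
    prob = 0.0
    for a in range(n_a + 1):
        for b in range(n_b + 1):
            if two_group_bayes_factor(a, b, n_a, n_b) > threshold:
                prob += binom_pmf(a, n_a, c_a) * binom_pmf(b, n_b, c_b)
    return prob


for c_a, c_b in [(0.5, 0.5), (0.55, 0.45), (0.9, 0.1)]:
    print(f"c_a={c_a}, c_b={c_b}: false-alarm probability = "
          f"{false_alarm_probability(c_a, c_b):.4f}")
```

In this sketch a large gap such as 0.9 versus 0.1 is scored as strong evidence for life far more often than equal rates are, because the analysis has no way to attribute a lopsided split to anything other than differing life rates.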

Load-bearing premise

Two groups of targets can be chosen such that life is substantially more likely in one than the other while the confounding-signal rate stays exactly the same across both.

What would settle it

Recompute the Bayes factor for every possible outcome of a 24-target survey split into two groups that differ in expected life prevalence; if substantially fewer than about one-quarter of those outcomes give strong evidence favoring life under the diffuse priors, the claimed improvement collapses.

Figures

Figures reproduced from arXiv: 2605.02969 by David Kipping.

Figure 1: Matrix of possible experimental labels that are assigned in an idealized binary classification experiment. A positive detection label is given by the symbol S, whereas a non-detection label is S¯. A positive detection will be assigned in three of the four possible outcomes. In each quadrant, we present the label and the associated probability.
Figure 2: Assuming a known confounder rate (C), we here show the Bayes factor of the null hypothesis that life is absent, f = 0, versus the affirmative hypothesis that life is present, f > 0, as a function of the number of positive detection labels obtained by an HWO-like survey (Ntot = 25). The four different markers correspond to four different choices for C - here unrealistically treated as a perfectly known fixe…
Figure 3: Assuming a diffuse prior for the confounder rate (C), we here show the Bayes factor of the null hypothesis that life is absent, f = 0, versus the affirmative hypothesis that life is present, f > 0, as a function of the number of positive detection labels obtained by an HWO-like survey (Ntot = 25). The three different markers correspond to three different choices for the priors on f and C. The figure reveal…
Figure 4: Probability distribution of Npos, the number of positive detection labels that manifest under three different choices for the prior distribution of f and C. One might expect that the double-uniform and double-Jeffrey’s priors are ideal diffuse priors, but as one can see here they produce a strong bias towards large Npos values. The final prior, G[0, 1], is engineered to produce a uniform distribution in Np…
Figure 5: Grid plot of the log-Bayes factor for the null (no life) hypothesis versus the affirmative (life) hypothesis (Kf0) when a sample of Ntot = 24 targets are split into two groups with differing life rates but a global confounder rate. Although a large zone of ambiguity persists, the life hypothesis can be strongly claimed in 40 of 169 possible outcomes. For comparison, with a monolithic population we find tha…
Figure 6: Three examples of the marginalized posteriors obtained for fA, fB and C when Ntot = 24 and is divided into two groups, A and B. Each panel corresponds to a grid position shown in …
Figure 7: Assuming an evenly-divided AB sample of Ntot targets, we show the fraction of experimental outcomes that lead to strong evidence in favor of one of the two hypotheses, the life- and null-hypothesis. The fraction of outcomes that favor a null result converges to ∼ 6.3%, whereas the pathway to a life detection grows with Ntot, exceeding half of samples (horizontal line) for Ntot ≥ 76 (red circles). For eithe…
Original abstract

Planned and ongoing searches for life, both biological and technological, confront an epistemic barrier concerning false positives - namely, that we don't know what we don't know. The most defensible and agnostic approach is to adopt diffuse (uninformative) priors, not only for the prevalence of life, but also for the prevalence of confounders. We evaluate the resulting Bayes factors between the null and life hypotheses for an idealized experiment with $N_{pos}$ positive labels (biosignature detections) among $N_{tot}$ targets with various priors. Using diffuse priors, the consequences are catastrophic for life detection, requiring at least ${\sim}10^4$ (for some priors ${\sim}10^{13}$) surveyed targets to ever obtain "strong evidence" for life. Accordingly, an HWO-scale survey with $N_{tot}{\sim}25$ would have no prospect of achieving this goal. A previously suggested workaround is to forgo the agnostic confounder prior, by asserting some upper limit on it for example, but we find that the results can be highly sensitive to this choice - as well as difficult to justify. Instead, we suggest a novel solution that retains agnosticism: by dividing the sample into two groups for which the prevalence of life differs, but the confounder rate is global. We show that a $N_{tot}=24$ survey could expect 24% of possible outcomes to produce strong life detections with this strategy, rising to $\geq50$% for $N_{tot}\geq76$. However, AB-testing introduces its own unique challenges to survey design, requiring two groups with differing life prevalence rates (ideally greatly so) but a global confounder rate.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper argues that agnostic Bayesian inference with diffuse priors on both life prevalence p_L and confounder prevalence p_C requires impractically large surveys (∼10^4 to 10^13 targets) to obtain strong evidence for life in biosignature searches. It proposes a workaround that partitions targets into two groups with differing p_L but a single global p_C, and reports via Monte-Carlo sampling that this yields a 24% probability of strong detections for N_tot=24, rising to ≥50% for N_tot≥76.

Significance. If the central modeling assumptions hold, the work quantifies the severe impact of uninformative priors on life-detection prospects and supplies concrete numerical guidance for survey design via the AB-testing strategy. The Monte-Carlo evaluation over binomial outcomes provides a reproducible framework for assessing Bayes-factor outcomes under the proposed partitioning, which could inform HWO-scale mission planning if the shared-confounder premise can be justified.

major comments (2)
  1. [workaround derivation and Monte-Carlo results] The reported 24% fraction of strong-evidence outcomes for N_tot=24 (and the scaling to N_tot≥76) is obtained under the assumption that p_C is exactly identical across the two groups while p_L differs. No sensitivity analysis is provided for even modest group-dependent variation in p_C (e.g., Δp_C∼0.01–0.05 arising from differing abiotic processes). If this assumption is violated, the likelihood ratio no longer cleanly isolates the life hypothesis, so the quoted success probabilities do not apply.
  2. [Bayesian model and numerical results] The definition of 'strong evidence' and the precise functional form of the diffuse priors on p_L and p_C are not stated explicitly in the main text; the numerical thresholds (10^4 and 10^13) and the 24%/50% figures therefore cannot be independently reproduced without the supplementary derivation. The manuscript should supply the exact prior densities and the Bayes-factor cutoff used in the sampling.
minor comments (1)
  1. [abstract and results paragraph] The abstract states 'N_tot=24 survey could expect 24% of possible outcomes'; the corresponding section should clarify whether this is an expectation over the binomial sampling distribution or a different averaging procedure.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed review of our manuscript. We address each major comment below and indicate the revisions we will make to improve the clarity, reproducibility, and robustness of the work.

  1. Referee: The reported 24% fraction of strong-evidence outcomes for N_tot=24 (and the scaling to N_tot≥76) is obtained under the assumption that p_C is exactly identical across the two groups while p_L differs. No sensitivity analysis is provided for even modest group-dependent variation in p_C (e.g., Δp_C∼0.01–0.05 arising from differing abiotic processes). If this assumption is violated, the likelihood ratio no longer cleanly isolates the life hypothesis, so the quoted success probabilities do not apply.

    Authors: We agree that the assumption of an exactly shared global confounder rate p_C is central to the AB-testing strategy and that violations could affect the isolation of the life hypothesis. The manuscript already notes the requirement for a global confounder rate as a key challenge of the approach. To address the concern directly, the revised manuscript will include a new sensitivity analysis via Monte Carlo sampling that introduces modest group-dependent differences (Δp_C = 0.01–0.05) and reports the resulting degradation in the probability of strong detections. This will clarify the conditions under which the quoted success rates remain approximately valid and will be presented as a limitation of the method when the shared-confounder premise cannot be justified.
    revision: partial

  2. Referee: The definition of 'strong evidence' and the precise functional form of the diffuse priors on p_L and p_C are not stated explicitly in the main text; the numerical thresholds (10^4 and 10^13) and the 24%/50% figures therefore cannot be independently reproduced without the supplementary derivation. The manuscript should supply the exact prior densities and the Bayes-factor cutoff used in the sampling.

    Authors: We appreciate the referee highlighting this issue of reproducibility. The diffuse priors employed are uniform densities on the interval [0,1] for both p_L and p_C, and 'strong evidence' is defined as a Bayes factor exceeding 10 in favor of the life hypothesis (following the conventional Jeffreys scale). The Monte Carlo procedure samples binomial outcomes under these priors to compute the fraction of realizations yielding strong evidence. In the revised manuscript we will state these definitions, the exact prior forms, and the Bayes-factor threshold explicitly in the main text, ensuring all numerical results can be reproduced without reference to the supplement.
    revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation uses external priors and forward simulation

The paper computes Bayes factors under diffuse priors on prevalences and estimates outcome fractions via Monte Carlo sampling of binomial draws under an explicit two-group model with shared confounder rate. These steps are forward calculations from stated assumptions rather than reductions of outputs to inputs by construction. No self-definitional equations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the derivation chain; the 24% figure is a simulation result conditional on the proposed AB-test structure, not a tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the assumption that Bayesian model comparison with diffuse priors is the appropriate epistemic stance, that 'strong evidence' corresponds to a conventional Bayes-factor threshold, and that two survey groups can be engineered with different life prevalences but identical confounder rates.

axioms (2)
  • Domain assumption: Bayesian updating with uninformative priors is the correct way to quantify evidence when both signal and background are unknown.
    Invoked throughout the abstract as the 'most defensible and agnostic approach'.
  • Domain assumption: A global confounder rate can be maintained while life prevalence is made to differ between two groups.
    Central to the proposed workaround.

pith-pipeline@v0.9.0 · 5604 in / 1376 out tokens · 34735 ms · 2026-05-09T16:22:01.429128+00:00 · methodology

