arxiv: 2602.15171 · v2 · submitted 2026-02-16 · 🌌 astro-ph.IM · astro-ph.SR

A Response to paper "Critical Evaluation of Studies Alleging Evidence for Technosignatures in the POSS1-E Photographic Plates" by Watters et al. (2026)

Beatriz Villarroel, Alina Streblyanska, Stephen Bruehl, Stefan Geier

Pith reviewed 2026-05-15 21:26 UTC · model grok-4.3

keywords: technosignatures, photographic plates, POSS1-E, Earth shadow deficit, ensemble statistics, historical astronomy, data filtering, statistical critique

The pith

The reported Earth-shadow deficit in historical sky plates holds up against a recent critique.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper defends earlier statistical findings of an Earth-shadow deficit in POSS1-E photographic plates against a critique that analyzes a much smaller, differently filtered subset of the data. It argues that the critique conflates checks on individual objects with ensemble-level statistics, and that its sample is twenty times smaller than the original, was built for another purpose, lacks complete timing information, and is underpowered for the test. The authors note that the horizontal separation metric requires the cos(Dec) factor for geometric accuracy and that the critique omits uncertainty estimates. They conclude that the original principal findings remain intact.

Core claim

The principal findings of an Earth-shadow deficit reported in Villarroel et al. (2025) and Bruehl & Villarroel (2025) are not invalidated by the analyses in Watters et al. (2026), because those analyses rely on a reduced, heterogeneously filtered subset originally constructed for a different purpose, omit the cos(Dec) factor in plate assignment, lack uncertainty propagation, and are statistically underpowered for testing the ensemble-level signal.

What carries the argument

Ensemble-level statistical inference on the distribution of plate anomalies relative to the Earth-shadow boundary, contrasted with object-level validation on a reduced subset.

If this is right

The Earth-shadow deficit can continue to be treated as a candidate signal requiring follow-up with larger or better-characterized datasets.
Accurate plate-to-time assignment must include the cos(Dec) factor to avoid systematic errors in temporal reconstruction.
Future studies of archival plates should report uncertainty estimates and error propagation to allow direct comparison of results.
Visual purity checks on aggressively filtered subsets can be used to assess whether sample-size loss is justified by improved data quality.
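
The cos(Dec) point in the list above can be illustrated with a minimal sketch. It assumes a flat-sky approximation and a nearest-center assignment rule; the function names, the plate centers, and the source position are hypothetical and are not taken from any of the papers discussed. The sketch only shows that dropping the cos(Dec) factor can change which plate a source is matched to.

```python
import math

def ra_offset_deg(ra1_deg, ra2_deg, dec_deg, use_cos_dec=True):
    """East-west offset in degrees between two right ascensions."""
    # Wrap the RA difference into [-180, 180) so 359 deg vs 1 deg gives 2 deg.
    d_ra = (ra1_deg - ra2_deg + 180.0) % 360.0 - 180.0
    if use_cos_dec:
        # RA circles shrink toward the poles, so scale by cos(Dec).
        d_ra *= math.cos(math.radians(dec_deg))
    return d_ra

def nearest_plate(ra_deg, dec_deg, plate_centers, use_cos_dec=True):
    """Name of the plate whose center is closest (flat-sky approximation)."""
    def dist(center):
        _name, plate_ra, plate_dec = center
        dx = ra_offset_deg(ra_deg, plate_ra, dec_deg, use_cos_dec)
        dy = dec_deg - plate_dec
        return math.hypot(dx, dy)
    return min(plate_centers, key=dist)[0]

# Hypothetical plate centers (name, RA, Dec) and source position, in degrees.
plates = [("A", 100.0, 60.0), ("B", 106.0, 63.5)]
src_ra, src_dec = 104.0, 60.5

with_cos = nearest_plate(src_ra, src_dec, plates, use_cos_dec=True)
without_cos = nearest_plate(src_ra, src_dec, plates, use_cos_dec=False)
print(with_cos, without_cos)
```

In this toy configuration the two variants pick different plates, which is the kind of discrepancy that would propagate into inferred observation times. A production analysis would more likely use a full spherical separation rather than this flat-sky shortcut.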

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the deficit survives consistent re-analysis, it would motivate targeted searches for similar patterns in other historical plate archives.
Alternative explanations such as unrecognized plate defects or selection biases would need direct tests that preserve the full sample size.
The case illustrates a general need for consistent data-handling protocols when critiques re-use subsets built for separate questions.

Load-bearing premise

The original ensemble statistical signal remains detectable even after the data are reduced to a twenty-fold smaller subset filtered for a different scientific purpose.

What would settle it

A re-run of the Earth-shadow deficit test on the full original sample after applying the critique's exact filtering steps, including the cos(Dec) correction and uncertainty estimates, would show whether the deficit disappears.
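
As a rough sketch of what such a re-run could look like, the deficit test can be framed as a one-sided binomial test: given the fraction of sky coverage expected to lie in Earth's shadow, how surprising is the observed count there? This framing is an assumption made for illustration, not the statistic used in the original papers, and the counts and expected fraction below are hypothetical placeholders.

```python
import math

def log_binom_pmf(k, n, p):
    """Log-probability of exactly k successes in Binomial(n, p), 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))

def deficit_p_value(k_obs, n, p_expected):
    """One-sided P(X <= k_obs) for X ~ Binomial(n, p_expected)."""
    total = sum(math.exp(log_binom_pmf(k, n, p_expected))
                for k in range(k_obs + 1))
    return min(total, 1.0)  # guard against rounding slightly above 1

# Hypothetical inputs, NOT values from the papers.
n_total = 1000      # transients in the sample
p_shadow = 0.01     # expected fraction falling in Earth's shadow
k_in_shadow = 3     # transients actually found in the shadow

p_value = deficit_p_value(k_in_shadow, n_total, p_shadow)
print(f"one-sided p-value for a deficit: {p_value:.4f}")
```

Running the same function on the full sample and on the filtered subset, with identical plate assignment, would put the two results on a common footing.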

Original abstract

We respond to the critique by Watters et al. (2026) of the statistical analyses in Villarroel et al. (2025) and Bruehl & Villarroel (2025). We argue that the critique conflates object-level validation with ensemble-level statistical inference and relies on a reduced, heterogeneously filtered subset originally constructed for a different scientific purpose. We further question whether the aggressively filtered subset used in Watters et al. (2026) demonstrates a meaningful improvement in sample purity, given the twenty-fold reduction in sample size. Our simple, visual check does not suggest that it does. The subset further lacks complete temporal information and is seriously statistically underpowered for testing the reported Earth-shadow deficit. We emphasise that the horizontal separation metric used for plate assignment and time reconstruction as in Watters et al. (2026) depends on the inclusion of the cos(Dec) factor to ensure geometric consistency. Any omission would alter plate assignment and inferred observation times. Moreover, the analyses presented in Watters et al. (2026) do not include uncertainty estimates or error propagation, limiting the interpretability of the claimed null results. We conclude that the principal findings reported in Villarroel et al. (2025) and Bruehl & Villarroel (2025) are not invalidated by the analyses presented in Watters et al. (2026).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Response flags real issues with the critique's sample size and missing cos(Dec) term but does not re-run the ensemble statistic on corrected data.

The main thing to know is that this paper argues the Watters critique fails to invalidate the earlier Earth-shadow deficit claims because it uses a much smaller, differently filtered subset that lacks temporal coverage and statistical power. It also notes that the horizontal separation metric needs the cos(Dec) factor for geometric accuracy and that the critique omits error propagation entirely. These points are logically coherent and worth raising in the exchange.

The response does a solid job showing why object-level purity checks on a reduced sample do not directly test the original ensemble-level result. The visual inspection of purity is presented as informal evidence that the aggressive filtering did not deliver a clear gain.

The soft spot is that the paper stops there. It identifies the problems with the critique but supplies no recomputed deficit significance on the full original sample using the corrected metric, no power analysis quantifying how the twenty-fold reduction affects detectability, and no propagated uncertainties on the claimed null. The non-invalidation conclusion therefore rests on the critique being methodologically mismatched rather than on a direct quantitative defense.

This is relevant reading for anyone following technosignature work on archival plates or statistical handling of plate data. It deserves peer review so the specific points on sample construction and metric consistency can be checked against the prior papers.

Referee Report

3 major / 1 minor

Summary. The manuscript responds to Watters et al. (2026), which critiqued the statistical analyses in Villarroel et al. (2025) and Bruehl & Villarroel (2025) on potential technosignatures in POSS1-E plates. It argues that the critique conflates object-level validation with ensemble-level inference, relies on a twenty-fold smaller heterogeneously filtered subset lacking temporal completeness, omits the cos(Dec) factor in the horizontal separation metric for plate assignment, and provides no uncertainty estimates or error propagation. The authors conclude that the original principal findings of an Earth-shadow deficit are not invalidated.

Significance. If the arguments hold, the response would preserve the validity of the ensemble-level Earth-shadow deficit detection in the original studies, strengthening the case for further investigation of technosignatures. Credit is due for identifying the geometric necessity of the cos(Dec) term and the absence of error propagation in the critique, which are concrete methodological contributions. However, the lack of re-computed ensemble statistics limits the quantitative defense of the central claim.

major comments (3)

[Abstract / main text] Abstract and main text: The claim that the aggressively filtered subset is 'seriously statistically underpowered' for testing the Earth-shadow deficit is asserted without a power calculation, simulation of minimum detectable effect size, or comparison of the original versus reduced-sample significance levels, which is load-bearing for the non-invalidation conclusion.
[Main text on plate assignment] Discussion of horizontal separation metric: The manuscript correctly notes that the metric depends on the cos(Dec) factor for geometric consistency and that its omission would alter plate assignments and inferred times, yet it does not recompute the deficit statistic on the original Villarroel et al. (2025) sample using the corrected metric to quantify any change in the reported ensemble result.
[Conclusion] Conclusion: The assertion that the principal findings are not invalidated rests on the object-level versus ensemble-level distinction and the critique's subset limitations, but the response provides no explicit re-application of the ensemble test (with error propagation) to either the critique's subset or the full sample, leaving the robustness unquantified.
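
The power calculation requested in the first major comment could be sketched along the following lines. This is an illustration under stated assumptions: a one-sided exact binomial test for a deficit, a hypothetical null shadow fraction, a hypothetical true (reduced) fraction, and hypothetical sample sizes chosen only to differ by a factor of twenty. None of these numbers come from the papers.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with 0 < p < 1."""
    if k < 0:
        return 0.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1)
                   - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)

def power_of_deficit_test(n, p_null, p_alt, alpha=0.05):
    """Power of the exact one-sided test that rejects when X <= k_crit."""
    # k_crit is the largest count whose lower-tail probability under the
    # null stays at or below alpha; -1 means the test can never reject.
    k_crit = -1
    while binom_cdf(k_crit + 1, n, p_null) <= alpha:
        k_crit += 1
    return binom_cdf(k_crit, n, p_alt)

# Hypothetical scenario: the true shadow fraction is half the null value.
p_null, p_alt = 0.01, 0.005
n_full, n_small = 2000, 100   # sizes differ by a factor of twenty

power_full = power_of_deficit_test(n_full, p_null, p_alt)
power_small = power_of_deficit_test(n_small, p_null, p_alt)
print(f"power at n={n_full}: {power_full:.2f}")
print(f"power at n={n_small}: {power_small:.2f}")
```

With these placeholder values the smaller sample cannot reject at all, because even zero counts in the shadow would not be unusual under the null. Whether the real subset behaves this way depends on the actual sample sizes and expected fractions, which is what a formal power analysis would establish.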

minor comments (1)

[Abstract] The 'simple, visual check' on sample purity is mentioned without specifying the visual criteria or quantitative contamination rates, which would improve reproducibility and allow readers to assess the claimed lack of improvement.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the careful and constructive review of our response to Watters et al. (2026). We address each major comment point by point below, providing the strongest substantive defense consistent with the manuscript while indicating where revisions will be incorporated.

Referee: [Abstract / main text] The claim that the aggressively filtered subset is 'seriously statistically underpowered' for testing the Earth-shadow deficit is asserted without a power calculation, simulation of minimum detectable effect size, or comparison of the original versus reduced-sample significance levels, which is load-bearing for the non-invalidation conclusion.

Authors: We acknowledge that a formal power calculation would add quantitative precision. The twenty-fold reduction in sample size, combined with the subset's lack of complete temporal information and heterogeneous filtering for a different purpose, inherently limits statistical power for detecting the Earth-shadow deficit reported in the full ensemble of Villarroel et al. (2025). We will add a concise comparison of sample sizes and a qualitative assessment of the expected loss in sensitivity to the revised manuscript, while maintaining that this does not alter the conclusion that the critique's subset cannot invalidate the original ensemble-level result.

revision: partial

Referee: [Main text on plate assignment] The manuscript correctly notes that the metric depends on the cos(Dec) factor for geometric consistency and that its omission would alter plate assignments and inferred times, yet it does not recompute the deficit statistic on the original Villarroel et al. (2025) sample using the corrected metric to quantify any change in the reported ensemble result.

Authors: The original analyses in Villarroel et al. (2025) and Bruehl & Villarroel (2025) already incorporated the geometrically correct horizontal separation metric that includes the cos(Dec) factor. The critique omitted this term, which invalidates their plate assignments and time inferences. Recomputing the ensemble statistic under the critique's incorrect metric would not be informative, as it would embed the same geometric error. The manuscript's point is that the critique's approach is methodologically flawed on this basis; no recomputation on the flawed metric is required to defend the original findings.

revision: no

Referee: [Conclusion] The assertion that the principal findings are not invalidated rests on the object-level versus ensemble-level distinction and the critique's subset limitations, but the response provides no explicit re-application of the ensemble test (with error propagation) to either the critique's subset or the full sample, leaving the robustness unquantified.

Authors: The core distinction between object-level validation and ensemble-level inference remains central: the critique's reduced subset was constructed for a different purpose and lacks the temporal completeness needed for a meaningful re-application of the Earth-shadow test. The original ensemble analyses already include proper statistical treatment. We will revise the conclusion to explicitly restate that the critique provides no valid alternative ensemble test with error propagation, thereby reinforcing that the principal findings stand.

revision: partial

Standing simulated objections (not resolved)

Re-application of the ensemble test with error propagation to the critique's subset or the full sample to provide new quantitative robustness metrics

Circularity Check

0 steps flagged

No significant circularity detected

The paper's derivation chain consists of methodological objections to the critique: the use of a twenty-fold smaller, heterogeneously filtered subset, omission of the cos(Dec) factor in the horizontal separation metric, incomplete temporal information, lack of statistical power, and missing uncertainty estimates or error propagation. These points are independent observations about the critique's construction and do not reduce any claimed result to a self-definition, a fitted parameter renamed as a prediction, or an ansatz imported from self-citation. Self-references to Villarroel et al. (2025) and Bruehl & Villarroel (2025) simply identify the original claims under discussion; the defense itself rests on external analysis of the critique rather than assuming those claims by construction or re-deriving them from the same inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The response rests on standard statistical requirements for ensemble inference, sample power, and error propagation rather than introducing new free parameters or entities.

axioms (1)

standard math: Ensemble-level statistical inference on transients requires adequate sample size, complete temporal metadata, and propagated uncertainties to support claims of a deficit.
Invoked when criticizing the critique's twenty-fold reduction in sample size, missing temporal information, and absence of error estimates.
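
To make the uncertainty requirement concrete, one simple option is to attach a confidence interval to the observed shadow fraction. The sketch below uses the Wilson score interval, chosen here as an assumption because it behaves sensibly for small counts; the papers do not specify this method, and the counts are hypothetical.

```python
import math

def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion k/n (z=1.96 ~ 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = k / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n
                         + z * z / (4.0 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, center - half), min(1.0, center + half)

# Hypothetical counts, NOT values from the papers.
lo_big, hi_big = wilson_interval(3, 1000)     # larger sample
lo_small, hi_small = wilson_interval(0, 50)   # twenty-fold smaller sample

print(f"n=1000: [{lo_big:.4f}, {hi_big:.4f}]")
print(f"n=50:   [{lo_small:.4f}, {hi_small:.4f}]")
```

An interval of this kind shows at a glance whether a small-sample null result is actually inconsistent with the rate implied by a larger sample, or merely too wide to discriminate.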

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Statistically Significant Linear Alignments Among High-Confidence Transient Candidates on POSS-I Photographic Plates
astro-ph.IM 2026-05 unverdicted novelty 6.0

Statistically significant linear alignments among high-confidence transient candidates on 1949-1957 photographic plates are detected, projecting to constant geographic longitudes with clustering near specific Earth sites.

Machine Learning Supports Existence of Previously Unrecognized Transient Astronomical Phenomena in Historical Observatory Images
astro-ph.IM 2026-04 unverdicted novelty 4.0

Machine learning classification strengthens evidence that transient point sources in pre-Sputnik plates are real phenomena, with elevated counts near nuclear tests and reduced counts in Earth's shadow.