pith. machine review for the scientific record.

arXiv: 2605.09726 · v1 · submitted 2026-05-10 · 🧮 math.ST · stat.ME · stat.TH

On the Impossibility of Specification Testing of Interference Models Based on Exposure Mappings

Chao Gao, Christopher Harshaw, Fredrik S\"avje, Yitan Wang

Pith reviewed 2026-05-12 02:56 UTC · model grok-4.3

classification 🧮 math.ST · stat.ME · stat.TH
keywords specification testing · exposure mappings · interference models · causal inference · Type I error · Type II error · impossibility result · randomized experiments

The pith

Any specification test for an exposure mapping model has worst-case Type I and Type II errors summing to one when it must have power against larger models

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that specification tests for interference models based on exposure mappings cannot simultaneously control Type I error and achieve meaningful power against incorrect specifications. When a test is required to detect deviations from a given exposure mapping toward any strictly larger mapping, its worst-case Type I error rate plus worst-case Type II error rate equals one for every sample size. This bound holds even when outcomes are uniformly bounded and the alternatives are maximally separated from the null under randomized experiments. The result implies that, in the worst case, such tests perform no better than a procedure that discards the data and rejects the null hypothesis at random, for example with probability 1/2. The authors complement the impossibility result by constructing a uniformly consistent test that distinguishes the no-interference model from a network linear-in-means model.

Core claim

For any specification test of a given exposure mapping model that is required to have power against a strictly larger exposure mapping model, the supremum Type I error plus the supremum Type II error equals one. The result applies to all finite sample sizes, to outcomes taking values in [0,1], and to alternatives that differ from the null in the most extreme way permitted by the exposure mapping framework.
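
The core claim can be written compactly. The notation below is an assumption introduced here for illustration and is not taken from the paper: \phi is a test (1 means reject), \Theta_0 is the set of data-generating processes consistent with the null exposure mapping, and \Theta_1 is the set consistent with the strictly larger mapping but not with the null.

```latex
% Assumed notation: \phi \in \{0,1\} is the test decision,
% \Theta_0 the null class, \Theta_1 the alternative class.
\underbrace{\sup_{\theta \in \Theta_0} \Pr_{\theta}(\phi = 1)}_{\text{worst-case Type I error}}
\;+\;
\underbrace{\sup_{\theta \in \Theta_1} \Pr_{\theta}(\phi = 0)}_{\text{worst-case Type II error}}
\;=\; 1
\qquad \text{for every sample size } n .
```

Read this way, pushing the first term down to a level \alpha forces the second term up to 1 - \alpha.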

What carries the argument

An exposure mapping, which assigns each unit one of several discrete exposure levels determined by the full vector of treatment assignments across all units, together with the partial order on exposure mappings in which one mapping is larger than another when it induces a finer partition of units into exposure groups.
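
To make the partial order concrete, here is a minimal sketch under stated assumptions. The 4-unit path graph, the two example mappings, and the helper names `refines` and `strictly_larger` are hypothetical and do not come from the paper. The sketch treats one mapping as at least as large as another when, for every unit, the first mapping's exposure level determines the second's; this is one reading of "finer partition" and may differ in detail from the paper's definition.

```python
from itertools import product

# Hypothetical 4-unit path graph 0 - 1 - 2 - 3 (an assumption for illustration).
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}


def no_interference(i, z):
    """Exposure of unit i depends only on its own treatment."""
    return z[i]


def neighborhood(i, z):
    """Exposure is own treatment plus whether any neighbor is treated."""
    return (z[i], int(any(z[j] for j in NEIGHBORS[i])))


def refines(fine, coarse, n=4):
    """True if, for every unit, the `fine` exposure determines the `coarse` one."""
    for i in range(n):
        seen = {}  # fine exposure level -> coarse exposure level observed with it
        for z in product((0, 1), repeat=n):  # enumerate all 2^n assignments
            f, c = fine(i, z), coarse(i, z)
            if seen.setdefault(f, c) != c:
                return False  # same fine level, two different coarse levels
    return True


def strictly_larger(a, b, n=4):
    """True if mapping `a` refines `b` but `b` does not refine `a`."""
    return refines(a, b, n) and not refines(b, a, n)


if __name__ == "__main__":
    print(strictly_larger(neighborhood, no_interference))
```

In this toy example the neighborhood mapping is strictly larger than the no-interference mapping, which is the kind of null-versus-alternative pair the impossibility result concerns.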

If this is right

  • Specification tests that do not restrict the alternative class cannot reliably detect misspecified interference models.
  • Useful tests require the analyst to commit in advance to a narrow pair of models rather than testing against all larger exposure mappings.
  • A uniformly consistent test exists for the specific case of distinguishing no interference from a network linear-in-means model.
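
For readers unfamiliar with the restricted alternative named in the last bullet, the following sketch shows one common textbook form of a network linear-in-means outcome model. It is an assumption that this form matches the paper's; the function name, the coefficients `alpha`, `beta`, `gamma`, and the noise-free outcomes are all hypothetical choices made for illustration. It does not implement the authors' test.

```python
def linear_in_means_outcomes(z, neighbors, alpha=0.2, beta=0.3, gamma=0.4):
    """Noise-free outcomes under an assumed linear-in-means form.

    Outcome of unit i = alpha + beta * z[i] + gamma * (fraction of treated neighbors).
    With alpha, beta, gamma >= 0 and alpha + beta + gamma <= 1 the outcomes lie in [0, 1].
    Setting gamma = 0 recovers a no-interference model.
    """
    outcomes = []
    for i, z_i in enumerate(z):
        nbrs = neighbors[i]
        # Units without neighbors get a treated-neighbor fraction of zero.
        frac = sum(z[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        outcomes.append(alpha + beta * z_i + gamma * frac)
    return outcomes


if __name__ == "__main__":
    # Hypothetical path graph 0 - 1 - 2 - 3 with only unit 0 treated.
    graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(linear_in_means_outcomes((1, 0, 0, 0), graph))
```

The point of the restriction is that the alternative is indexed by a single spillover coefficient (here `gamma`) rather than by an arbitrary response to the finer exposure levels.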

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The impossibility suggests that data-driven selection among interference models will generally require domain knowledge to limit the candidate models under consideration.
  • Similar limits may apply when testing other structured causal models that are defined by partitions or groupings induced by the treatment assignment.
  • Researchers could explore the weakest additional assumptions on the alternative class that restore the possibility of informative tests with controlled errors.

Load-bearing premise

The test must be required to have power against every possible larger exposure mapping model, with no further restrictions placed on how those alternatives differ from the null model.

What would settle it

A concrete test statistic and rejection threshold for which there exists at least one exposure mapping null model and one strictly larger alternative model such that the worst-case Type I error plus the worst-case Type II error is strictly less than one.
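
A candidate counterexample could be screened numerically along the following lines. This is a sketch under assumptions: the helper `worst_case_error_sum` is hypothetical, and it takes as input rejection probabilities that the analyst has already computed or estimated for a finite collection of null and alternative data-generating processes. Because a maximum over a finite collection can only understate the true supremum, a value below one from this check would not by itself refute the claim; it would only flag a test worth analyzing over the full classes.

```python
def worst_case_error_sum(reject_prob_null, reject_prob_alt):
    """Sum of worst-case Type I and Type II error over finite candidate families.

    reject_prob_null: rejection probabilities, one per null data-generating process.
    reject_prob_alt:  rejection probabilities, one per alternative process.
    """
    if not reject_prob_null or not reject_prob_alt:
        raise ValueError("both families must be non-empty")
    type_1 = max(reject_prob_null)                  # worst-case false rejection
    type_2 = max(1.0 - p for p in reject_prob_alt)  # worst-case false acceptance
    return type_1 + type_2


if __name__ == "__main__":
    # Hypothetical numbers: a 5% level test with high power against one
    # alternative but only trivial power against another.
    print(worst_case_error_sum([0.05, 0.03], [0.90, 0.05]))
```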

read the original abstract

In order to estimate causal effects in a randomized experiment where spillovers are suspected to occur, analysts must posit a model of interference. The most popular class of interference models are those based on exposure mappings. In practice, it is rarely clear which interference model accurately captures the true nature of spillovers in the experiment. In response, researchers have developed specification tests which seek to determine whether a given interference model is correctly specified. In this context, Type I error is the rejection rate when the interference model is actually correct and Type II error is the acceptance rate when the interference model is incorrectly specified. While existing tests have been explicitly constructed to control Type I error, their Type II error remains less well understood. In this paper, we provide a strong impossibility result: any specification test for an exposure mapping model which aims to have power against a larger exposure mapping model has worst-case Type I and Type II errors that sum to one. This means that no specification test can provide uniformly better performance than the naive test which discards all data and rejects the null at random. Our negative result holds for all sample sizes, for uniformly bounded outcomes, and for alternatives which are maximally separated from the null. Informative specification tests must therefore further restrict the alternative model against which they seek to attain power. To this end, we provide a uniformly consistent test for differentiating no-interference from a network-linear-in-means model.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper establishes a finite-sample impossibility result for specification tests of exposure mapping models in randomized experiments with interference. Any test that controls Type I error under a given exposure mapping null while attaining power against the class of all strictly larger exposure mappings has worst-case Type I error (supremum over null distributions) plus worst-case Type II error (supremum over the larger class) equal to exactly 1. This holds for every finite sample size n, uniformly bounded outcomes, and randomized designs, and is tight because the random-guess test achieves equality. The authors note that informative tests therefore require further restrictions on the alternative and supply one such example: a uniformly consistent test for no-interference versus a network linear-in-means model.

Significance. If the central impossibility result holds, it has clear significance for causal inference under interference: it shows that unrestricted power against larger exposure mappings cannot improve upon random guessing in the worst case, forcing researchers to impose additional structure on alternatives (as the authors do in their positive result). The finite-sample character, the minimal assumptions (bounded outcomes and randomized designs), and the explicit tightness example are strengths. The work directly informs the design of specification tests for network experiments that use exposure mappings.

minor comments (2)
  1. Abstract: the phrase 'alternatives which are maximally separated from the null' is used without an immediate cross-reference to the precise definition employed in the main impossibility theorem; adding a parenthetical pointer would improve readability.
  2. Introduction: the formal definition of an exposure mapping and the partial order used to define 'strictly larger' models appear only after several paragraphs of motivation; moving a concise statement of these objects earlier would help readers follow the subsequent impossibility claim.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of the manuscript and recommendation for minor revision. The referee's summary accurately captures both the impossibility result and the constructive example we provide.

Circularity Check

0 steps flagged

No significant circularity; impossibility result derived directly from error-rate definitions

Full rationale

The central impossibility result follows from the definitions of Type I error (supremum rejection probability under the null exposure mapping) and Type II error (supremum acceptance probability under any strictly larger exposure mapping), together with the requirement that the test must control the former while attaining power against the unrestricted larger class. For any test, adversarial distributions can be constructed (under bounded outcomes and randomized designs) such that the sum of these worst-case errors equals 1; the random-guess test achieves equality, showing the bound is tight. This argument uses only the model definitions and finite-sample error-rate suprema; it does not rely on fitted parameters, self-citations, or any ansatz imported from prior work. The subsequent positive result (uniformly consistent test for no-interference versus network linear-in-means) is constructed explicitly under an additional restriction on the alternative and likewise contains no self-referential steps.
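
The tightness half of this argument is easy to verify directly. The sketch below computes the exact error rates of a test that ignores the data and rejects with a fixed probability; the function name `random_guess_errors` is hypothetical. Since the decision does not depend on the data, its rejection probability is the same under every null and every alternative distribution, so the suprema are trivial.

```python
from fractions import Fraction


def random_guess_errors(reject_prob):
    """Exact worst-case errors of a test that rejects with probability reject_prob.

    The test discards the data, so its rejection probability is identical
    under every data-generating process.
    """
    p = Fraction(reject_prob)
    if not 0 <= p <= 1:
        raise ValueError("reject_prob must lie in [0, 1]")
    type_1 = p        # probability of rejecting when the null is correct
    type_2 = 1 - p    # probability of accepting when the null is incorrect
    return type_1, type_2


if __name__ == "__main__":
    for p in (Fraction(1, 2), Fraction(1, 20)):
        t1, t2 = random_guess_errors(p)
        print(p, t1 + t2)
```

Any rejection probability gives a sum of exactly one, which is the benchmark the impossibility result says no data-dependent test can beat in the worst case.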

Axiom & Free-Parameter Ledger

0 free parameters · 3 axioms · 0 invented entities

The central claim rests on standard causal inference assumptions including randomized treatment assignment and uniformly bounded outcomes. No free parameters or new entities are introduced; the result is an impossibility derived from the exposure mapping framework itself.

axioms (3)
  • domain assumption: Outcomes are uniformly bounded.
    Explicitly stated as holding for the impossibility result in the abstract.
  • domain assumption: The experiment is a randomized experiment.
    Core setup for defining causal effects and exposure mappings.
  • domain assumption: Interference is captured by exposure mappings.
    The models under test are defined via exposure mappings.

pith-pipeline@v0.9.0 · 5561 in / 1383 out tokens · 48541 ms · 2026-05-12T02:56:16.149563+00:00 · methodology
