Propensity Score Propagation: A General Framework for Design-Based Inference with Unknown Propensity Scores

Siyu Heng; Yanxin Shen; Zijian Guo

arxiv: 2601.13150 · v3 · pith:Q47KNYFTnew · submitted 2026-01-19 · 📊 stat.ME

Pith reviewed 2026-05-16 13:06 UTC · model grok-4.3

classification 📊 stat.ME

keywords design-based inference · propensity scores · finite population · observational studies · coverage probability · regeneration procedure · unknown design probabilities

The pith

Propensity score propagation achieves nominal coverage for design-based inference when propensity scores are unknown and estimated from data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a framework called propensity score propagation to enable valid design-based inference in settings where the assignment probabilities are unknown and must be estimated. It introduces a regeneration-and-union procedure that re-samples or re-estimates the propensity scores and combines the resulting design-based quantities to form intervals or tests that account for estimation uncertainty. This approach works with both parametric and nonparametric propensity score models and can be paired with existing design-based methods that assume the scores are known. Conventional plug-in or matching methods typically ignore or mishandle that uncertainty, producing intervals whose actual coverage falls below the nominal level. The new procedure restores nominal coverage by treating the estimated scores as random objects whose variability is propagated through the design-based calculation.

Core claim

The central claim is that a regeneration-and-union procedure applied to propensity score estimates produces design-based confidence intervals and tests that attain their nominal coverage rate in finite populations, even when the scores are estimated parametrically or nonparametrically and even in regimes where plug-in estimators exhibit substantial under-coverage.

What carries the argument

The regeneration-and-union procedure, which draws multiple realizations of the estimated propensity scores and unions the design-based inferences computed from each realization to form a final interval or test.
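A minimal sketch of how such a procedure could look, not the paper's algorithm. It assumes a logistic propensity model, regenerates coefficients from a normal approximation around the fitted values, computes a Horvitz-Thompson interval for each draw, and reports the smallest interval containing all of them. The function names (`fit_logistic`, `ht_interval`, `regenerate_and_union`), the normal regeneration, the clipping bounds, and the per-draw 95% level are all illustrative assumptions; this page does not say how the paper draws realizations or calibrates levels.

```python
import numpy as np

def fit_logistic(x, z, n_iter=25):
    """Newton-Raphson logistic fit; returns design matrix, coefficients, covariance."""
    X = np.column_stack([np.ones(len(z)), x])
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-X @ beta))
        hess = X.T @ (X * (p * (1 - p))[:, None])
        beta = beta + np.linalg.solve(hess, X.T @ (z - p))
    p = 1 / (1 + np.exp(-X @ beta))
    cov = np.linalg.inv(X.T @ (X * (p * (1 - p))[:, None]))
    return X, beta, cov

def ht_interval(y, z, e, crit=1.96):
    """Horvitz-Thompson interval treating the scores e as if they were known."""
    t = z * y / e - (1 - z) * y / (1 - e)
    # Sum of squared terms over n^2: conservative in expectation
    # under independent Bernoulli assignment.
    se = np.sqrt(np.sum(t ** 2)) / t.size
    return t.mean() - crit * se, t.mean() + crit * se

def regenerate_and_union(y, z, x, n_draws=200, seed=0):
    """Assumed regeneration: beta ~ N(beta_hat, cov_hat). Union: hull of all intervals."""
    rng = np.random.default_rng(seed)
    X, beta, cov = fit_logistic(x, z)
    draws = rng.multivariate_normal(beta, cov, size=n_draws)
    lows, highs = [], []
    for b in np.vstack([beta, draws]):          # first row is the plug-in fit
        e = np.clip(1 / (1 + np.exp(-X @ b)), 0.01, 0.99)  # clipping is an assumption
        lo, hi = ht_interval(y, z, e)
        lows.append(lo)
        highs.append(hi)
    # For overlapping intervals the union equals this hull; otherwise it is a superset.
    return min(lows), max(highs)

# Hypothetical toy data, for illustration only.
rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
z = rng.binomial(1, 1 / (1 + np.exp(-0.5 * x))).astype(float)
y = 1.0 + x + 2.0 * z + rng.normal(size=n)
union_lo, union_hi = regenerate_and_union(y, z, x)
```

Because the plug-in fit is included among the realizations, the union interval in this sketch can never be narrower than the plug-in interval; the extra width is what stands in for propensity score estimation uncertainty.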

If this is right

The framework extends existing design-based methods developed for known propensity scores to the realistic case of estimated scores without requiring new outcome modeling assumptions.
It applies directly to observational studies, complex surveys, and missing-data problems that are governed by unknown design probabilities.
Theoretical results guarantee that the procedure attains nominal coverage for both parametric and nonparametric propensity score estimators.
Simulation evidence shows the method recovers nominal coverage in finite samples where conventional plug-in and matching approaches under-cover.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same regeneration logic could be tested on other nuisance parameters that appear in finite-population problems, such as estimated sampling weights.
Empirical checks on real data sets with known randomization mechanisms would provide an external validation of the coverage guarantee.
The approach may reduce the need for near-exact matching in observational studies by allowing more flexible propensity score models while still preserving design-based validity.

Load-bearing premise

The regeneration-and-union step correctly folds the variability of the estimated propensity scores into the design-based quantities without introducing bias or breaking the finite-population exactness properties.

What would settle it

A Monte Carlo experiment in which the empirical coverage of the propagated intervals falls materially below the nominal 95 percent level for a nonparametric propensity score estimator and a moderate sample size.
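A hedged sketch of the skeleton such an experiment might use. It holds a finite population of potential outcomes fixed, redraws only the treatment assignment, and counts how often an interval procedure covers the finite-population average treatment effect. The toy population, the helper names (`empirical_coverage`, `oracle_ht_interval`), and the use of an oracle Horvitz-Thompson interval with the true scores are assumptions made for illustration; the decisive experiment would swap in the propagated interval built from a nonparametric propensity score estimator.

```python
import numpy as np

def empirical_coverage(interval_fn, y0, y1, e_true, n_reps=1000, seed=0):
    """Share of assignment draws whose interval covers the finite-population ATE."""
    rng = np.random.default_rng(seed)
    tau = np.mean(y1 - y0)              # fixed target: potential outcomes never change
    hits = 0
    for _ in range(n_reps):
        z = rng.binomial(1, e_true)     # the only source of randomness is the design
        y = np.where(z == 1, y1, y0)    # observed outcomes under this assignment
        lo, hi = interval_fn(y, z)
        hits += int(lo <= tau <= hi)
    return hits / n_reps

# Hypothetical toy population, generated once and then held constant.
pop_rng = np.random.default_rng(42)
n = 300
x = pop_rng.normal(size=n)
y0 = 1.0 + x + pop_rng.normal(size=n)
y1 = y0 + 2.0
e_true = 1 / (1 + np.exp(-0.5 * x))

def oracle_ht_interval(y, z, e=e_true, crit=1.96):
    """Placeholder procedure: Horvitz-Thompson interval using the TRUE scores."""
    t = z * y / e - (1 - z) * y / (1 - e)
    se = np.sqrt(np.sum(t ** 2)) / t.size   # conservative in expectation
    return t.mean() - crit * se, t.mean() + crit * se

coverage = empirical_coverage(oracle_ht_interval, y0, y1, e_true)
```

Passing a different `interval_fn` with the same `(y, z)` signature lets the plug-in and propagated procedures be compared on identical assignment draws by reusing the same `seed`.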

Original abstract

Design-based inference, also known as randomization-based or finite-population inference, provides a principled framework for trustworthy statistical inference by attributing randomness solely to the design mechanism (e.g., treatment assignment, survey sampling, or missingness), without imposing super-population distributional or modeling assumptions on outcome data. From Fisher's and Neyman's seminal work to the recent resurgence of design-based inference, this perspective has played a central role in causal inference, survey sampling, and missing data analysis. However, a fundamental obstacle has limited its use in many modern applications: existing design-based inference theory typically relies on known propensity scores (i.e., known design probabilities), whereas propensity scores are usually unknown in observational studies, real-world survey settings, and missing data problems. We propose propensity score propagation, a general framework for valid design-based inference with unknown propensity scores. The framework introduces a regeneration-and-union procedure that propagates uncertainty from propensity score estimation into downstream design-based inference without imposing super-population outcome assumptions. It accommodates both parametric and nonparametric propensity score models, integrates seamlessly with existing design-based methods developed under known propensity scores, and applies broadly across design-based inference problems. Theoretical results and simulation studies show that the proposed framework achieves nominal coverage, even when existing approaches exhibit substantial under-coverage.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The regeneration-and-union procedure gives a workable way to fold propensity score estimation uncertainty into design-based estimators, with simulation gains over plug-ins, but the finite-population unbiasedness claim looks weaker for nonparametric models.

The main takeaway is that the authors introduce a regeneration-and-union step to propagate uncertainty from estimated propensity scores into standard design-based estimators such as Horvitz-Thompson. This keeps the randomization-based interpretation intact even when the scores are unknown, which is the usual situation in observational data. The procedure works for both parametric and nonparametric propensity models and slots into existing design-based tools without major rewriting. Simulations show it comes closer to nominal coverage than simple plug-in or matching approaches in the settings they examine, which is a practical improvement over the under-coverage problems noted in the abstract. The framework is presented as general across design-based problems, and the authors supply both theoretical arguments and numerical checks to back the coverage claim. That combination is the clearest contribution.

The soft spot is the exact finite-population unbiasedness for nonparametric propensity estimators. The regeneration step adds randomness from the estimation procedure, and the union then mixes that with the original design randomness. It is not obvious that the joint distribution leaves the original design probabilities untouched, so the expectation may not equal the target parameter exactly. If the proofs are mainly asymptotic normality or coverage rather than an exact unbiasedness result, the strict design-based guarantee is diluted. The abstract asserts theoretical support, but the nonparametric case would benefit from tighter conditions. Simulation details such as estimator choices or exclusion rules are secondary but worth checking in a review.

This paper is aimed at statisticians who already work with design-based methods in causal inference or survey sampling and want to handle unknown propensity scores without switching to model-based inference. A reader comfortable with Horvitz-Thompson and Hajek estimators will see the value quickly. It deserves a serious referee because the gap it targets is real, the simulations are relevant, and the core idea is straightforward enough to evaluate and improve.
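For readers less familiar with the two estimators named in the letter, here is a short sketch of their textbook forms for an average treatment effect when the propensity scores are treated as known. The function names and the four-unit toy data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def horvitz_thompson_ate(y, z, e):
    """Horvitz-Thompson estimate: inverse-probability weighted sums divided by n."""
    y, z, e = (np.asarray(a, dtype=float) for a in (y, z, e))
    n = y.size
    return np.sum(z * y / e) / n - np.sum((1 - z) * y / (1 - e)) / n

def hajek_ate(y, z, e):
    """Hajek estimate: same weights, normalized to sum to one within each arm."""
    y, z, e = (np.asarray(a, dtype=float) for a in (y, z, e))
    w1 = z / e
    w0 = (1 - z) / (1 - e)
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

# Hypothetical toy data: observed outcomes, assignments, and known scores.
y = [3.0, 1.0, 4.0, 2.0]
z = [1, 0, 1, 0]
e = [0.8, 0.2, 0.5, 0.5]
ht = horvitz_thompson_ate(y, z, e)
hajek = hajek_ate(y, z, e)
```

The two estimates differ whenever the realized inverse-probability weights in an arm do not sum to n, which is why the Hajek form is often more stable when some scores are close to 0 or 1.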

Referee Report

2 major / 1 minor

Summary. The paper proposes propensity score propagation, a general framework using a regeneration-and-union procedure to propagate uncertainty from estimating unknown propensity scores (parametric or nonparametric) into design-based inference. It integrates with existing methods for known propensities and claims theoretical and simulation support for achieving nominal coverage where plug-in and matching approaches under-cover.

Significance. If the regeneration-and-union step preserves finite-population design properties while correctly accounting for propensity estimation uncertainty, the framework would provide a flexible, general solution to under-coverage in design-based inference for observational data, surveys, and missingness settings, extending beyond parametric restrictions in prior finite-population M-estimation work.

major comments (2)

[Abstract] The claim of nominal coverage from 'theoretical results' for nonparametric propensity score models is central; however, it is unclear whether exact design-unbiasedness (e.g., for Horvitz-Thompson or Hajek estimators) is established or only asymptotic normality, given that regenerated weights from nonparametric estimation introduce randomness whose joint distribution with the original design may not preserve the finite-population probabilities.
[Theoretical development (presumably §3–4)] The regeneration-and-union procedure must be shown to leave the expectation of the propagated estimator equal to the target parameter under the true design even for nonparametric propensity models; without an explicit argument addressing the non-deterministic nature of the estimated weights, the finite-population guarantee is at risk of being weaker than stated.
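The concern in these two comments can be made concrete with the textbook Horvitz-Thompson calculation. The sketch below assumes independent Bernoulli assignment $Z_i$ with true propensity score $e_i$, fixed potential outcomes $Y_i(1), Y_i(0)$, and covariates $X$; the notation is chosen here for illustration and is not the paper's.

```latex
% Known scores: the weights have expectation one, so the estimator is design-unbiased.
\hat{\tau}_{\mathrm{HT}}(e)
  = \frac{1}{N}\sum_{i=1}^{N}
    \left\{ \frac{Z_i\, Y_i(1)}{e_i} - \frac{(1-Z_i)\, Y_i(0)}{1-e_i} \right\},
\qquad
\mathbb{E}\!\left[\frac{Z_i}{e_i}\right]
  = \mathbb{E}\!\left[\frac{1-Z_i}{1-e_i}\right] = 1
\;\Longrightarrow\;
\mathbb{E}\big[\hat{\tau}_{\mathrm{HT}}(e)\big]
  = \frac{1}{N}\sum_{i=1}^{N}\{Y_i(1) - Y_i(0)\}.

% Estimated scores: \hat{e}_i is a function of the same assignment vector Z,
% so the weight no longer has expectation one in general.
\mathbb{E}\!\left[\frac{Z_i}{\hat{e}_i(Z, X)}\right] \neq 1
\quad \text{in general}.
```

Exact unbiasedness therefore does not carry over automatically once the scores are estimated from the realized assignment, which is why the comments ask whether the guarantee is exact or asymptotic.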

minor comments (1)

[Notation] Throughout, explicitly distinguish the regenerated propensity weights from the original design weights in the union step to prevent reader confusion about which randomness is being propagated.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments, which have helped us clarify the scope of our theoretical guarantees. We address each major point below and have revised the manuscript accordingly to improve precision regarding asymptotic versus exact properties.

Referee: [Abstract] The claim of nominal coverage from 'theoretical results' for nonparametric propensity score models is central; however, it is unclear whether exact design-unbiasedness (e.g., for Horvitz-Thompson or Hajek estimators) is established or only asymptotic normality, given that regenerated weights from nonparametric estimation introduce randomness whose joint distribution with the original design may not preserve the finite-population probabilities.

Authors: We appreciate this observation and agree that the distinction is important. Our theoretical results in Sections 3 and 4 establish asymptotic normality of the propagated estimators under the finite-population design, with the regeneration-and-union procedure incorporating the additional variability from nonparametric propensity estimation. We do not claim exact finite-sample design-unbiasedness for the nonparametric case, as the estimated weights are stochastic. We have revised the abstract to specify 'asymptotic nominal coverage' and added a clarifying sentence in the introduction to distinguish this from exact unbiasedness.
revision: yes

Referee: [Theoretical development (presumably §3–4)] The regeneration-and-union procedure must be shown to leave the expectation of the propagated estimator equal to the target parameter under the true design even for nonparametric propensity models; without an explicit argument addressing the non-deterministic nature of the estimated weights, the finite-population guarantee is at risk of being weaker than stated.

Authors: We agree that an explicit argument addressing the non-deterministic weights is necessary. In the revised Section 3, we have expanded the proof to show that the regeneration step preserves the conditional expectation of the estimator given the estimated propensities, while the union step combines the inferences computed from the multiple regenerated designs. For nonparametric models, the argument proceeds via the law of total expectation combined with consistency of the propensity estimator, yielding asymptotic unbiasedness and coverage rather than exact finite-sample unbiasedness under the true design. We have also added a remark noting this limitation explicitly.
revision: yes

Circularity Check

0 steps flagged

No circularity: regeneration-and-union procedure is an independent construction integrating with existing design-based methods

The paper proposes a regeneration-and-union procedure as a new framework to propagate uncertainty from propensity score estimation (parametric or nonparametric) into downstream design-based inference. This is presented as integrating with existing methods under known propensity scores rather than redefining or fitting quantities by construction. No equations or steps in the abstract reduce a claimed prediction or result to its own inputs (e.g., no fitted parameter renamed as prediction, no self-definitional loop). Theoretical claims of nominal coverage are asserted via the new procedure without evidence of tautological reduction. Self-citations, if present, are not load-bearing for the central claim per the provided description. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on standard design-based inference principles (randomness only from the design) plus the validity of the new regeneration-and-union step, which is introduced without independent external verification in the abstract.

axioms (1)

domain assumption · Design-based inference attributes randomness solely to the design mechanism without distributional assumptions on outcomes.
Explicitly stated in the abstract as the foundational framework.

invented entities (1)

propensity score propagation framework · no independent evidence
purpose: To propagate uncertainty from propensity score estimation into design-based inference
New procedure introduced by the paper; no independent evidence outside the paper's claims is provided in the abstract.

pith-pipeline@v0.9.0 · 5525 in / 1241 out tokens · 29905 ms · 2026-05-16T13:06:43.567345+00:00 · methodology
