Conformal Inference for Experimental Attrition in Social Science Research
Pith reviewed 2026-05-13 22:36 UTC · model grok-4.3
The pith
A conformal inference approach generates prediction intervals for treatment effects that remain valid even with participant attrition in experiments.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper introduces a conformal inference framework for experimental attrition that produces prediction intervals for treatment effects with guaranteed coverage under exchangeability conditions. Simulations and reanalyses of real experiments, which also allow subgroup comparisons across attrition patterns, show narrower intervals than complete-case analysis, multiple imputation, or weighting methods.
What carries the argument
Conformal prediction adapted to missing outcomes from attrition, which builds finite-sample valid prediction intervals by leveraging exchangeability without parametric models for the missingness mechanism.
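The basic mechanism can be illustrated with a minimal split conformal sketch. This is not the paper's attrition-adjusted procedure: it assumes fully observed outcomes, an absolute-residual score, and a fixed stand-in predictor, and the names `conformal_quantile` and `split_conformal_interval` are hypothetical.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for a finite interval
    return sorted(scores)[k - 1]


def split_conformal_interval(predict, calib_x, calib_y, x_new, alpha=0.1):
    """Interval predict(x_new) +/- q built from absolute-residual scores."""
    scores = [abs(y - predict(x)) for x, y in zip(calib_x, calib_y)]
    q = conformal_quantile(scores, alpha)
    center = predict(x_new)
    return center - q, center + q


# Hypothetical toy data: y = 2x + Gaussian noise, with a fixed stand-in model.
rng = random.Random(0)
xs = [rng.uniform(0, 1) for _ in range(200)]
ys = [2 * x + rng.gauss(0, 0.5) for x in xs]
lo, hi = split_conformal_interval(lambda x: 2 * x, xs, ys, x_new=0.5)
```

The interval width comes entirely from the calibration residuals, so no parametric model of the noise is needed; the guarantee rests on the calibration and test points being exchangeable.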
If this is right
- Treatment effect estimates can be accompanied by intervals whose validity does not rest on strong assumptions about why participants leave the study.
- Researchers gain the ability to compare effect sizes for completers, attriters, and the full sample within one framework.
- Simulation evidence indicates higher coverage rates and shorter interval lengths than complete-case, imputation, or weighting approaches.
- The procedure supplies a direct route to robust causal statements in experiments where attrition is common.
Where Pith is reading between the lines
- The framework could be tested on longitudinal social science data to see whether intervals remain valid when attrition correlates with unobserved traits.
- Integration with existing survey weighting schemes might further tighten intervals without sacrificing the finite-sample guarantee.
- Application to non-experimental observational studies with similar missingness patterns would extend the reach beyond randomized trials.
- Checking coverage on hold-out samples from new experiments would provide a practical diagnostic for the exchangeability premise.
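The hold-out diagnostic in the last point could be sketched as follows. The helper names and the toy numbers are hypothetical, and the check is only possible for units whose outcomes are actually observed, so on real data it says nothing direct about attriters.

```python
def empirical_coverage(intervals, outcomes):
    """Share of held-out outcomes that fall inside their intervals."""
    hits = sum(lo <= y <= hi for (lo, hi), y in zip(intervals, outcomes))
    return hits / len(outcomes)


def mean_width(intervals):
    """Average interval length, the precision measure paired with coverage."""
    return sum(hi - lo for lo, hi in intervals) / len(intervals)


# Hypothetical hold-out: four intervals and the outcomes later observed.
intervals = [(0.0, 1.0), (0.0, 1.0), (2.0, 3.0), (2.0, 3.0)]
outcomes = [0.5, 1.5, 2.5, 3.0]
coverage = empirical_coverage(intervals, outcomes)
width = mean_width(intervals)
```

Comparing `coverage` with the nominal level 1 - alpha on a fresh sample gives a rough signal of whether the exchangeability premise is plausible.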
Load-bearing premise
The observations must satisfy the exchangeability conditions needed for conformal inference to deliver its coverage guarantee.
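For reference, the standard marginal coverage guarantee of conformal prediction can be written as below. The notation (data pairs $(X_i, Y_i)$, prediction set $\hat{C}_n$, miscoverage level $\alpha$) is generic and assumed here; it is not necessarily the paper's own.

```latex
\text{If } (X_1, Y_1), \dots, (X_{n+1}, Y_{n+1}) \text{ are exchangeable
and the conformal score is symmetric in the calibration data, then}
\quad
\mathbb{P}\bigl( Y_{n+1} \in \hat{C}_n(X_{n+1}) \bigr) \;\ge\; 1 - \alpha .
```

The probability is over the calibration data and the new point jointly, so the guarantee is marginal rather than conditional on a particular covariate value.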
What would settle it
A dataset or simulation in which the produced intervals cover the true treatment effect at a rate below the nominal level or fail to be narrower than standard methods while preserving coverage.
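A toy simulation of the kind that could expose such a failure is sketched below. Everything in it is an assumption made for illustration: standard normal outcomes, no covariates, a trivial predictor of 0, and a hypothetical attrition rule under which units with extreme outcomes drop out far more often. It does not reproduce the paper's simulation design.

```python
import math
import random

rng = random.Random(1)
alpha = 0.1
y = [rng.gauss(0, 1) for _ in range(5000)]
# Hypothetical MNAR rule: units with extreme outcomes respond far less often.
observed = [rng.random() < (0.9 if abs(v) < 1 else 0.2) for v in y]

# Scores are |y - 0| because the stand-in predictor is the constant 0.
completers = [abs(v) for v, r in zip(y, observed) if r]
attriters = [abs(v) for v, r in zip(y, observed) if not r]

# Calibrate on half of the completers, hold out the other half.
half = len(completers) // 2
calib, holdout = completers[:half], completers[half:]
k = math.ceil((len(calib) + 1) * (1 - alpha))
q = sorted(calib)[k - 1]

# In a simulation the attriters' outcomes are known, unlike in real data.
cover_completers = sum(s <= q for s in holdout) / len(holdout)
cover_attriters = sum(s <= q for s in attriters) / len(attriters)
```

Under this rule the unadjusted intervals stay close to nominal coverage for held-out completers but fall well below it for attriters, which is the pattern a settling test would look for in the adjusted method.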
Original abstract
Attrition in survey and field experiments presents a challenge for social science research. Common approaches to deal with this problem -- such as complete case analysis, multiple imputation, and weighting methods -- rely on strong assumptions that may not hold in practice. This paper introduces a new method that combines recent advances in statistical inference with established tools for handling missing data. The approach produces prediction intervals for treatment effects that are both robust and precise. Evidence from simulation studies shows that the method achieves better coverage and produces narrower intervals than common alternatives. The reanalysis of two recently published experiment studies illustrates how this framework allows researchers to compare treatment effects across participants who remain in the study, those who drop out, and the full sample. Taken together, these results highlight how the proposed approach provides a stronger foundation for causal inference in the presence of attrition.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a method that integrates conformal prediction with tools for handling missing data to construct prediction intervals for treatment effects in randomized experiments subject to attrition. It claims the resulting intervals are valid and narrower than those from complete-case analysis, multiple imputation, or weighting, with supporting evidence from simulation studies and reanalyses of two published field experiments that compare effects among stayers, dropouts, and the full sample.
Significance. If the coverage guarantees survive the attrition adjustment, the approach would supply a finite-sample, distribution-free alternative to parametric missing-data methods that is directly useful for social-science experiments. The reported simulation gains in coverage and interval width, together with the empirical illustrations, indicate potential practical value once the exchangeability conditions are made explicit.
Major comments (3)
- [§3.2] §3.2 (Conformal score construction): the paper does not specify whether nonconformity scores are computed on observed cases only, on imputed complete cases, or via a missingness-weighted score. Without this detail it is impossible to verify that the post-adjustment observations remain exchangeable, which is required for the marginal coverage claim.
- [§4.2] §4.2 (Simulation design): the reported coverage and width advantages are shown only for attrition mechanisms that appear to preserve exchangeability by construction. No results are given for MNAR processes that depend on potential outcomes, leaving open whether the coverage guarantee transfers to the most policy-relevant attrition patterns.
- [Theorem 1] Theorem 1 (Validity statement): the proof assumes exchangeability of the (possibly reweighted or imputed) sample, yet the manuscript provides no lemma or condition showing that the chosen missing-data adjustment restores this property when attrition is outcome-dependent.
Minor comments (2)
- [Abstract] The abstract refers to “recent advances in statistical inference” without naming the specific conformal variant or missing-data technique; adding one sentence would improve readability.
- [Figure 2] Figure 2 (reanalysis panels): axis labels and legend entries for the three subgroups (stayers, dropouts, full sample) are inconsistent across panels; standardize notation.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which help sharpen the presentation of our assumptions and scope. We address each major point below and will revise the manuscript accordingly.
Point-by-point responses
- Referee: [§3.2] §3.2 (Conformal score construction): the paper does not specify whether nonconformity scores are computed on observed cases only, on imputed complete cases, or via a missingness-weighted score. Without this detail it is impossible to verify that the post-adjustment observations remain exchangeable, which is required for the marginal coverage claim.
  Authors: We appreciate the referee highlighting this ambiguity. In the current manuscript the nonconformity scores are computed on the observed cases after inverse-probability weighting (under the MAR assumption) to restore exchangeability with the target population. We will revise §3.2 to state this procedure explicitly and add a short supporting lemma in the appendix showing that the weighted observed sample satisfies the exchangeability condition required for marginal coverage.
  Revision: yes
- Referee: [§4.2] §4.2 (Simulation design): the reported coverage and width advantages are shown only for attrition mechanisms that appear to preserve exchangeability by construction. No results are given for MNAR processes that depend on potential outcomes, leaving open whether the coverage guarantee transfers to the most policy-relevant attrition patterns.
  Authors: The simulations cover attrition mechanisms typical in social-science experiments (MAR and MNAR conditional on observed covariates). We agree that MNAR depending directly on potential outcomes is policy-relevant; however, such mechanisms violate exchangeability even after standard adjustments, so the conformal guarantee does not apply. We will add an explicit discussion of this limitation in §4.2 and the concluding section, noting that sensitivity analyses would be needed for those cases.
  Revision: partial
- Referee: [Theorem 1] Theorem 1 (Validity statement): the proof assumes exchangeability of the (possibly reweighted or imputed) sample, yet the manuscript provides no lemma or condition showing that the chosen missing-data adjustment restores this property when attrition is outcome-dependent.
  Authors: Theorem 1 is proved under the assumption that the adjusted sample is exchangeable, which holds when attrition is MAR. The manuscript does not claim validity for outcome-dependent MNAR. We will insert a new lemma in the appendix that formally establishes how inverse-probability weighting restores exchangeability under MAR, and we will clarify in the text that the coverage guarantee does not extend to attrition that depends on the potential outcomes themselves.
  Revision: yes
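The inverse-probability-weighted scores described in the responses could be computed along the following lines. This is a sketch of a generic weighted split conformal quantile, assumed here for illustration rather than taken from the manuscript. The function name, the toy propensities, and the choice of weight 1 / P(R = 1 | x) are all hypothetical.

```python
import math


def weighted_conformal_quantile(scores, weights, w_new, alpha):
    """Smallest score whose normalized cumulative weight reaches 1 - alpha.

    The test point carries weight w_new on a point mass at +infinity.
    """
    total = sum(weights) + w_new
    cum = 0.0
    for s, w in sorted(zip(scores, weights)):
        cum += w / total
        if cum >= 1 - alpha:
            return s
    return math.inf  # the finite scores never reach the required mass


# Hypothetical response propensities P(R = 1 | x) for four completers.
propensity = [0.9, 0.8, 0.5, 0.25]
weights = [1 / p for p in propensity]  # inverse-probability weights
scores = [0.3, 0.1, 0.7, 1.2]
q = weighted_conformal_quantile(scores, weights, w_new=1 / 0.5, alpha=0.2)
```

With equal weights this reduces to the usual unweighted conformal quantile. The weights can only correct for attrition that is explained by observed covariates, which matches the authors' statement that the guarantee is limited to MAR.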
Circularity Check
No significant circularity: method combines conformal prediction with standard missing-data tools without self-referential reductions
Full rationale
The paper presents a methodological combination of conformal inference (for prediction intervals) with established missing-data techniques (imputation, weighting, complete-case analysis) to handle attrition in experiments. No equations, derivations, or fitted parameters are shown that reduce the claimed prediction intervals or coverage guarantees to inputs by construction. Simulation evidence and reanalyses of published studies are offered as external validation rather than tautological outputs. No load-bearing self-citations, uniqueness theorems imported from the authors' prior work, or ansatzes smuggled via citation appear in the abstract or described framework. The approach relies on the standard exchangeability assumption of conformal methods, which is an external requirement rather than a self-defined property, making the derivation self-contained against established statistical benchmarks.