arxiv: 2605.09300 · v1 · submitted 2026-05-10 · 📊 stat.ME


Causal Stability Selection

Falco J. Bargagli-Stoffi, Omar Melikechi

Pith reviewed 2026-05-12 02:41 UTC · model grok-4.3


keywords causal inference · stability selection · effect modification · false discovery control · conditional average treatment effects · cross-fitting · observational data


The pith

Causal stability selection produces a set of treatment effect modifiers with an explicit finite-sample bound on expected false positives.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a procedure to identify covariates that modify how a treatment affects an outcome. Existing data-adaptive methods lack finite-sample control over false discoveries, so they risk selecting spurious modifiers that fail to replicate. Causal stability selection combines cross-fitted estimation of conditional average treatment effects (CATE) with integrated path stability selection. The resulting selection set carries a non-asymptotic bound on the expected number of false positives, and the estimated selection probabilities converge to their oracle counterparts at the same rate as the underlying treatment effect estimator. This connection matters because it ties the reliability of modifier discovery directly to the quality of treatment effect estimation, and it lets researchers search for effect modifiers in randomized trials or observational studies with a false-positive guarantee that holds at any finite sample size.

Core claim

Causal stability selection combines cross-fitted estimation of conditional average treatment effects with integrated path stability selection to produce a set of covariates that modify treatment effects, accompanied by an explicit bound on the expected number of false positives that holds in finite samples. Under standard causal identifying assumptions and regularity conditions on the base selector, the estimated selection probabilities converge to their oracle counterparts at the rate of the underlying treatment effect estimator. This establishes a direct connection between treatment effect estimation and effect modifier discovery.
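For orientation, the classical stability selection bound of Meinshausen and Bühlmann (2010) has the form below. It is shown only as a reference point: the paper's own bound comes from integrated path stability selection and is not reproduced on this page, so it may differ in form and constants. The symbols are notation assumed here, not taken from the paper: V is the number of falsely selected covariates, q_Λ is the expected number of covariates the base selector picks on a subsample over the regularization region Λ, p is the number of candidate covariates, and π_thr is the selection-frequency threshold. The classical result also assumes exchangeable selection of noise variables and a base selector that is no worse than random guessing.

```latex
\[
\mathbb{E}[V] \;\le\; \frac{1}{2\pi_{\mathrm{thr}} - 1}\,\frac{q_{\Lambda}^{2}}{p},
\qquad \pi_{\mathrm{thr}} \in \left(\tfrac{1}{2},\, 1\right)
\]
% Worked arithmetic: p = 20, q_\Lambda = 3, \pi_thr = 0.8
% gives 9 / (0.6 \times 20) = 0.75 expected false positives at most.
```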

What carries the argument

The argument rests on the causal stability selection algorithm, which merges cross-fitted conditional average treatment effect estimation with integrated path stability selection to enforce the false-positive bound.
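The two ingredients named above can be sketched in a few lines of plain Python. This is a minimal illustration under loud assumptions, not the authors' algorithm: the toy trial, all function names, the constant per-arm outcome model, and the top-q correlation screen used as base selector are hypothetical, and the resampling step is plain subsample-and-threshold stability selection rather than the integrated path variant the paper uses.

```python
import random
import statistics


def simulate_rct(n, p, modifiers, rng):
    """Toy randomized trial (hypothetical): only `modifiers` change the effect."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    T = [rng.randint(0, 1) for _ in range(n)]
    Y = []
    for x, t in zip(X, T):
        tau = 1.0 + sum(2.0 * x[j] for j in modifiers)  # true CATE
        Y.append(x[0] + t * tau + rng.gauss(0.0, 1.0))  # x[0] is prognostic only
    return X, T, Y


def cross_fitted_pseudo_outcomes(T, Y, rng, n_folds=2, e=0.5):
    """AIPW-style pseudo-outcomes; arm means are fit on the other folds."""
    idx = list(range(len(Y)))
    rng.shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    phi = [0.0] * len(Y)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        mu1 = statistics.fmean([Y[i] for i in train if T[i] == 1])
        mu0 = statistics.fmean([Y[i] for i in train if T[i] == 0])
        for i in fold:  # score held-out units with nuisances fit elsewhere
            phi[i] = (mu1 - mu0
                      + T[i] * (Y[i] - mu1) / e
                      - (1 - T[i]) * (Y[i] - mu0) / (1 - e))
    return phi


def abs_corr(a, b):
    """Absolute Pearson correlation of two equal-length lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return abs(cov) / (va * vb) ** 0.5


def top_q_selector(X, phi, q):
    """Hypothetical base selector: keep the q covariates most correlated with phi."""
    p = len(X[0])
    scores = [abs_corr([row[j] for row in X], phi) for j in range(p)]
    return set(sorted(range(p), key=lambda j: -scores[j])[:q])


def selection_frequencies(X, phi, q, n_subsamples, rng):
    """Fraction of half-size subsamples in which each covariate is selected."""
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        sub = rng.sample(range(n), n // 2)
        for j in top_q_selector([X[i] for i in sub], [phi[i] for i in sub], q):
            counts[j] += 1
    return [c / n_subsamples for c in counts]


rng = random.Random(0)
true_modifiers = [1, 2]
X, T, Y = simulate_rct(n=400, p=20, modifiers=true_modifiers, rng=rng)
phi = cross_fitted_pseudo_outcomes(T, Y, rng)
freqs = selection_frequencies(X, phi, q=3, n_subsamples=50, rng=rng)
selected = [j for j, f in enumerate(freqs) if f >= 0.8]
print("selection frequencies:", [round(f, 2) for f in freqs])
print("selected covariates:", selected)
```

Because the propensity is known in this toy trial, the pseudo-outcome has conditional mean equal to the CATE whatever outcome model is plugged in. In an observational study the propensity would also have to be estimated on the training folds.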

If this is right

The method accommodates arbitrary treatment effect estimators and arbitrary base selectors while still producing an explicit bound on the expected number of false positives.
Selection probabilities converge to oracle values at the same rate as the treatment effect estimator.
The procedure applies equally to randomized experiments and observational studies once the identifying assumptions are met.
Non-asymptotic control removes reliance on large-sample approximations common in existing adaptive selection methods.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Applied researchers could rank candidate modifiers by their estimated stability probabilities to allocate follow-up resources more efficiently.
The explicit bound may help address replication concerns in fields that use effect-modifier searches to guide personalized interventions.
Similar stability-based bounds could be derived for other causal estimands such as mediation or principal strata effects.

Load-bearing premise

Standard causal identifying assumptions of consistency, no unmeasured confounding and positivity, together with regularity conditions on the base selector.
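In potential-outcomes notation, the three identifying assumptions and the target estimand are usually written as follows. The notation is an assumption of this note rather than the paper's: a binary treatment T, covariates X, observed outcome Y, and potential outcomes Y(0) and Y(1).

```latex
\begin{align*}
\text{Consistency:} \quad
  & Y = T\,Y(1) + (1 - T)\,Y(0) \\
\text{No unmeasured confounding:} \quad
  & \{Y(0),\, Y(1)\} \perp\!\!\!\perp T \mid X \\
\text{Positivity:} \quad
  & 0 < \Pr(T = 1 \mid X = x) < 1 \quad \text{for all } x \text{ in the support of } X \\
\text{CATE:} \quad
  & \tau(x) = \mathbb{E}[\,Y(1) - Y(0) \mid X = x\,]
    = \mathbb{E}[\,Y \mid T = 1, X = x\,] - \mathbb{E}[\,Y \mid T = 0, X = x\,]
\end{align*}
```

The last equality is where the three assumptions are used: they let the unobservable contrast of potential outcomes be written in terms of observed data. A covariate is an effect modifier when τ(x) varies with it.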

What would settle it

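Apply the procedure to simulated data in which the true set of effect modifiers is known in advance and verify whether the number of false positives, averaged across repeated draws, stays below the stated bound. The average is the relevant quantity because the guarantee concerns the expected number of false positives, not the count in any single draw.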
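A check of this kind could look like the sketch below. It is a simplified stand-in, not the paper's procedure: the toy design, the transformed-outcome pseudo-outcome with known propensity 0.5, the top-q correlation screen, and the function name `one_trial` are all hypothetical, and the reference value is the classical Meinshausen and Bühlmann bound q² / ((2π − 1)p) rather than the paper's integrated path bound.

```python
import random


def abs_corr(a, b):
    """Absolute Pearson correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return abs(cov) / (va * vb) ** 0.5


def one_trial(n, p, modifiers, q, threshold, n_subsamples, rng):
    """Simulate one toy trial and return its number of false positives."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    T = [rng.randint(0, 1) for _ in range(n)]
    phi = []
    for x, t in zip(X, T):
        tau = 1.0 + sum(2.0 * x[j] for j in modifiers)  # true CATE
        y = x[0] + t * tau + rng.gauss(0.0, 1.0)
        # Transformed outcome with known propensity 0.5: E[phi | X] = tau(X).
        phi.append(y * (t - 0.5) / 0.25)
    counts = [0] * p
    for _ in range(n_subsamples):
        sub = rng.sample(range(n), n // 2)
        sub_phi = [phi[i] for i in sub]
        scores = [abs_corr([X[i][j] for i in sub], sub_phi) for j in range(p)]
        for j in sorted(range(p), key=lambda k: -scores[k])[:q]:
            counts[j] += 1
    selected = {j for j in range(p) if counts[j] / n_subsamples >= threshold}
    return len(selected - set(modifiers))


rng = random.Random(1)
p, q, threshold = 20, 3, 0.8
false_positives = [one_trial(300, p, [1, 2], q, threshold, 30, rng)
                   for _ in range(20)]
mean_fp = sum(false_positives) / len(false_positives)
bound = q ** 2 / ((2 * threshold - 1) * p)  # classical bound, used as a stand-in
print(f"mean false positives: {mean_fp:.2f}, classical bound: {bound:.2f}")
```

A fuller check would vary the CATE estimator, the signal strength, and the amount of confounding, since a bound that holds comfortably in an easy design says little about harder ones.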

Figures

Figures reproduced from arXiv: 2605.09300 by Falco J. Bargagli-Stoffi, Omar Melikechi.

**Figure 1.** Performance of variable selection methods for effect modifier discovery. Top row: True positive rate (TPR), defined as the proportion of identified effect modifiers among all true effect modifiers, averaged over 200 simulation trials. Bottom row: False discovery rate (FDR), defined as the proportion of false discoveries among all selected covariates, averaged over 200 trials. Black diagonal dashed lines in…

**Figure 2.** Linear results: Confounding variables. Top and bottom rows show mean TPR and mean FDR (averaged over 200 trials); the diagonal dashed line indicates perfect nominal FDR control. Left: |C| = 0 (RCT). Middle: |C| = 5. Right: |C| = 10.

**Figure 3.** Nonlinear results: Confounding variables. As in…
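The TPR and FDR definitions in the Figure 1 caption translate directly into code. The sketch below computes the per-trial true positive rate and false discovery proportion, then averages them across trials as the captions describe. The function name, the four example trials, and the convention that an empty selection has a false discovery proportion of zero are assumptions for illustration, not details from the paper.

```python
def tpr_and_fdp(selected, true_modifiers):
    """Per-trial true positive rate and false discovery proportion."""
    selected, true_modifiers = set(selected), set(true_modifiers)
    true_pos = len(selected & true_modifiers)
    tpr = true_pos / len(true_modifiers) if true_modifiers else 0.0
    # Convention (assumed): an empty selection contributes a proportion of 0.
    fdp = (len(selected) - true_pos) / len(selected) if selected else 0.0
    return tpr, fdp


# Hypothetical (selected set, true modifier set) pairs from four trials.
trials = [
    ({1, 2, 7}, {1, 2, 3}),   # two hits, one false discovery
    ({1, 2, 3}, {1, 2, 3}),   # perfect recovery
    (set(), {1, 2, 3}),       # nothing selected
    ({4, 5}, {1, 2, 3}),      # only false discoveries
]
results = [tpr_and_fdp(sel, truth) for sel, truth in trials]
mean_tpr = sum(t for t, _ in results) / len(results)
mean_fdr = sum(f for _, f in results) / len(results)
print(f"mean TPR = {mean_tpr:.3f}, mean FDR = {mean_fdr:.3f}")
```

Note that FDR is a ratio averaged over trials, whereas the paper's headline guarantee concerns the expected count of false positives, so the two are related but not interchangeable.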

Original abstract

Identifying covariates that modify treatment effects is a central problem in causal inference. Yet existing data-adaptive procedures do not provide finite-sample control over the expected number of false discoveries, risking spurious findings that fail to replicate. We introduce causal stability selection, an algorithm that combines cross-fitted estimation of conditional average treatment effects with integrated path stability selection. The method accommodates arbitrary treatment effect estimators and arbitrary base selectors, and produces a selection set with an explicit, non-asymptotic bound on the expected number of false positives. Under standard causal identifying assumptions and regularity conditions on the base selector, we prove that the estimated selection probabilities converge to their oracle counterparts at the rate of the underlying treatment effect estimator. This establishes a direct connection between treatment effect estimation and effect modifier discovery. We illustrate the method on a randomized trial in oncology and on observational data on maternal smoking and infant birthweight.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Causal stability selection adapts the stability selection bound to cross-fitted CATE estimates and proves convergence to oracle selection probabilities at the estimator rate.


The paper's main move is to take the standard stability selection framework, which bounds expected false positives through selection frequencies, and port it to causal effect-modifier discovery. They do this by using cross-fitted CATE estimates inside an integrated path stability procedure, and they supply both a non-asymptotic false-positive bound and a convergence result that ties the estimated selection probabilities to their oracle versions at the rate of the underlying treatment-effect estimator. That link is the part that feels useful in practice, because it makes the quality of the CATE step directly control the reliability of the selected modifiers. The method is written to accept arbitrary base estimators and selectors, which keeps it flexible.

The oncology trial and maternal-smoking examples show that the procedure runs on real data without special tuning. The assumptions are the usual causal ones plus regularity conditions on the selector, and the stress-test confirms the derivation does not hide circularity or unjustified steps. The bound itself is explicit, which is the practical selling point for fields that need replicability guarantees. The main soft spot is that the finite-sample behavior of the bound is not explored through targeted simulations that vary the CATE estimator or the signal strength; the real-data illustrations are fine but do not substitute for that check. The regularity conditions on the base selector will also need case-by-case verification when people plug in complex learners.

This is a paper for causal-inference researchers who already use CATE methods and want a selection step with some control on false discoveries. A reader working on variable selection in observational or trial data would get a concrete algorithm and a clear theoretical connection to estimation rates. It is worth sending to peer review because the core adaptation is cleanly executed and the guarantees address a gap that matters for applied work.

Referee Report

0 major / 3 minor

Summary. The manuscript introduces causal stability selection, which combines cross-fitted estimation of conditional average treatment effects (CATE) with integrated path stability selection. The central claims are that the procedure yields a selection set with an explicit non-asymptotic bound on the expected number of false positives and that, under standard causal identifying assumptions together with regularity conditions on the base selector, the estimated selection probabilities converge to their oracle counterparts at the rate of the underlying treatment effect estimator. The method is illustrated on a randomized oncology trial and observational data on maternal smoking and infant birthweight.

Significance. If the non-asymptotic bound and convergence result hold, the work supplies a rare finite-sample guarantee for effect-modifier discovery in causal inference, directly linking the accuracy of CATE estimation to the reliability of selection. The flexibility to accommodate arbitrary treatment-effect estimators and base selectors, together with the explicit connection to oracle quantities, strengthens the practical utility of the approach for reproducible causal findings.

minor comments (3)

The abstract and introduction refer to 'integrated path stability selection' without a one-sentence reminder of how the path is constructed from the base selector; a brief parenthetical would improve accessibility for readers outside the stability-selection literature.
Notation for the selection probability (estimated versus oracle) is introduced in the main text but could be typeset more distinctly (e.g., via consistent use of hats or superscripts) to avoid momentary confusion when the convergence result is first stated.
The real-data illustrations would benefit from a short table or paragraph summarizing the selected modifiers, their estimated selection probabilities, and the implied false-positive bound for the chosen threshold; this would make the practical output of the method more immediately visible.

Simulated Authors' Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of the manuscript and for recommending minor revision. The referee's description accurately reflects the core contributions: the non-asymptotic false-positive bound for causal stability selection and the convergence of estimated selection probabilities to their oracle counterparts. No specific major comments were provided in the report, so we have no points requiring detailed rebuttal or revision at this stage. We will incorporate any minor editorial suggestions in the revised version.

Circularity Check

0 steps flagged

No significant circularity detected


The paper adapts the standard (non-self) stability selection bound on expected false positives to cross-fitted CATE estimates and proves convergence of selection probabilities to independently defined oracle quantities at the rate of the base treatment-effect estimator. Both the bound and the convergence result are stated under explicit regularity conditions on the base selector plus the usual causal identifying assumptions; neither step reduces by construction to a fitted parameter, a self-citation chain, or a redefinition of the target quantity. The derivation is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard causal identifying assumptions and regularity conditions on the base selector; these are domain assumptions imported from the literature rather than derived here.

axioms (2)

domain assumption: Standard causal identifying assumptions (consistency, no unmeasured confounding, positivity). Invoked to justify validity of CATE estimation and the selection procedure.
domain assumption: Regularity conditions on the base selector. Required for the convergence of selection probabilities to oracle values.


