A Conformal Selection Framework for Individual Treatment Beneficiaries with Auxiliary External Data
Pith reviewed 2026-07-01 03:57 UTC · model grok-4.3
The pith
Conformal p-values calibrated on RCT data allow FDR-controlled selection of treatment beneficiaries using models trained on external data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The framework reformulates CATE-based treatment-benefit selection as a multiple-testing problem. For each candidate, it tests whether the conditional treatment benefit exceeds a clinically meaningful threshold and constructs a conformal p-value using RCT-based calibration. These p-values are adjusted by the Benjamini-Hochberg procedure to control the false discovery rate among selected beneficiaries. External data can be used to train the treatment effect models while conformal calibration remains anchored in the RCT data.
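In symbols, the reformulation can be sketched as follows. The notation is borrowed from the generic conformal selection literature and is an assumption made here, since this page does not reproduce the paper's own definitions: $\tau_j$ denotes the treatment benefit of candidate $j$, $c$ the clinically meaningful threshold, $V_1, \dots, V_n$ the scores of the $n$ RCT calibration units, $\widehat{V}_j$ the score of candidate $j$ evaluated at the threshold, $m$ the number of candidates, and $\alpha$ the target FDR level.

```latex
\begin{align*}
  &H_{0j}: \tau_j \le c
    \quad\text{versus}\quad
    H_{1j}: \tau_j > c,
    \qquad j = 1, \dots, m, \\[4pt]
  &p_j = \frac{1 + \sum_{i=1}^{n} \mathbf{1}\{ V_i \le \widehat{V}_j \}}{n + 1}, \\[4pt]
  &k^\ast = \max\Bigl\{ k \in \{1, \dots, m\} : p_{(k)} \le \frac{k\,\alpha}{m} \Bigr\},
    \qquad
    \mathcal{S} = \Bigl\{ j : p_j \le \frac{k^\ast \alpha}{m} \Bigr\}.
\end{align*}
```

Here $p_{(1)} \le \dots \le p_{(m)}$ are the ordered p-values, and $\mathcal{S}$ is taken to be empty when no $k$ satisfies the condition. A small $p_j$ means that few calibration scores fall at or below the candidate's score, which is read as evidence that the benefit exceeds $c$.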
What carries the argument
The conformal inference framework that constructs RCT-calibrated conformal p-values for testing whether the conditional treatment benefit exceeds a threshold, enabling FDR control via Benjamini-Hochberg adjustment.
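The mechanics can be sketched in a few lines of Python. This is a minimal illustration of a generic conformal selection pipeline, not the authors' implementation: the function names, the convention that small scores count as evidence of benefit, and the conservative handling of ties with `<=` are all assumptions made here, and the paper's exact score is not stated on this page.

```python
import numpy as np


def conformal_pvalues(cal_scores, test_scores):
    """Conformal p-values; small test scores count as evidence against the null.

    cal_scores:  scores of RCT calibration units (hypothetical input).
    test_scores: scores of candidates, evaluated at the benefit threshold.
    """
    cal = np.sort(np.asarray(cal_scores, dtype=float))
    test = np.asarray(test_scores, dtype=float)
    # Number of calibration scores <= each test score (ties counted conservatively).
    n_le = np.searchsorted(cal, test, side="right")
    return (1.0 + n_le) / (cal.size + 1.0)


def benjamini_hochberg(pvals, alpha=0.1):
    """Boolean mask of hypotheses rejected by the BH step-up procedure."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Compare the k-th smallest p-value with its threshold k * alpha / m.
    passed = p[order] <= alpha * np.arange(1, m + 1) / m
    selected = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.nonzero(passed)[0].max()  # largest rank meeting its threshold
        selected[order[: k + 1]] = True
    return selected


# Toy usage with made-up scores (not data from the paper).
cal_scores = np.array([0.0, 1.0, 2.0, 3.0])
test_scores = np.array([-1.0, 1.5, 5.0])
pvals = conformal_pvalues(cal_scores, test_scores)  # 0.2, 0.6, 1.0
selected = benjamini_hochberg(pvals, alpha=0.1)
```

Keeping the p-value step and the BH step as separate functions mirrors the separation described above: the score can come from any model, while only the calibration scores enter the p-value.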
Load-bearing premise
Conformal p-values constructed from RCT calibration remain valid for FDR control even when the treatment-effect model is trained on external data whose distribution may differ from the RCT.
What would settle it
A simulation study where the external data distribution differs substantially from the RCT and the realized false discovery rate among selected beneficiaries exceeds the target level.
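A toy version of such a study can be sketched as follows. Everything in it is an assumption made for illustration and none of it is the paper's simulation design: the benefit is treated as a directly observed label `y`, the RCT population follows `y = x + noise`, the external data follow `y = ext_slope * x + noise`, and the model is a straight-line fit. The function `realized_fdr` and its parameters are hypothetical names.

```python
import numpy as np


def conformal_pvalues(cal_scores, test_scores):
    cal = np.sort(np.asarray(cal_scores, dtype=float))
    n_le = np.searchsorted(cal, np.asarray(test_scores, dtype=float), side="right")
    return (1.0 + n_le) / (cal.size + 1.0)


def benjamini_hochberg(pvals, alpha):
    p = np.asarray(pvals, dtype=float)
    order = np.argsort(p)
    passed = p[order] <= alpha * np.arange(1, p.size + 1) / p.size
    selected = np.zeros(p.size, dtype=bool)
    if passed.any():
        selected[order[: np.nonzero(passed)[0].max() + 1]] = True
    return selected


def realized_fdr(ext_slope, c=0.0, alpha=0.1, n_ext=500, n_cal=200,
                 n_test=100, n_rep=200, seed=0):
    """Average false discovery proportion over n_rep toy replications.

    ext_slope != 1 mimics external data that differ from the RCT population.
    """
    rng = np.random.default_rng(seed)
    fdp = np.empty(n_rep)
    for r in range(n_rep):
        # Train the benefit model on external data only.
        x_ext = rng.normal(size=n_ext)
        y_ext = ext_slope * x_ext + rng.normal(size=n_ext)
        slope, intercept = np.polyfit(x_ext, y_ext, deg=1)

        # Calibration and test units both come from the RCT population.
        x_cal, x_test = rng.normal(size=n_cal), rng.normal(size=n_test)
        y_cal = x_cal + rng.normal(size=n_cal)
        y_test = x_test + rng.normal(size=n_test)

        cal_scores = y_cal - (slope * x_cal + intercept)
        test_scores = c - (slope * x_test + intercept)  # score at the threshold
        pvals = conformal_pvalues(cal_scores, test_scores)
        selected = benjamini_hochberg(pvals, alpha)

        # A discovery is false when the true benefit does not exceed c.
        false_discoveries = np.sum(selected & (y_test <= c))
        fdp[r] = false_discoveries / max(selected.sum(), 1)
    return float(fdp.mean())


for ext_slope in (1.0, 0.2, -1.0):
    print(ext_slope, realized_fdr(ext_slope))
```

Comparing the printed averages with `alpha` across increasingly mismatched values of `ext_slope` is the kind of check described above. A toy of this sort only shows what would be measured; it says nothing about the paper's own estimands or results.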
Original abstract
Identifying patients who are likely to benefit from a treatment is central to precision medicine and can guide follow-up trials, enrichment designs, and individualized decisions. Although randomized controlled trials (RCTs) provide evidence on efficacy, they are usually powered to estimate average treatment effects rather than patient-level benefit. Meanwhile, artificial intelligence and machine learning methods offer flexible tools for estimating heterogeneous treatment effects, especially when augmented by real-world data (RWD). However, in practice, these estimated effects are often translated into decisions through simple ranking or thresholding rules, which can ignore uncertainty and multiplicity when many patients are evaluated simultaneously. Motivated by this, we propose a model-agnostic conformal inference framework for uncertainty-aware beneficiary selection. The framework reformulates CATE-based treatment-benefit selection as a multiple-testing problem. For each candidate, we test whether the conditional treatment benefit exceeds a clinically meaningful threshold and construct a conformal p-value using RCT-based calibration. These p-values are then adjusted by the Benjamini-Hochberg procedure to control the false discovery rate (FDR) among selected beneficiaries. To improve efficiency when RCT sample sizes are limited, external data, such as RWD, can be used to train flexible treatment effect models, while conformal calibration remains anchored in the RCT data. It can be paired with conventional machine learning algorithms and emerging tabular foundation models. Simulations show that the framework maintains FDR control, with power depending on the base learner and external-data comparability. A case study in early-stage non-small-cell lung cancer illustrates how the method identifies candidate profiles with evidence of benefit from limited resection to reduce overtreatment.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a model-agnostic conformal inference framework for selecting individual treatment beneficiaries. It reformulates CATE-based selection as a multiple-testing problem, constructs conformal p-values calibrated exclusively on RCT data, and applies the Benjamini-Hochberg procedure to control FDR among selected beneficiaries. External data may be used to train the underlying treatment-effect model, while calibration and p-value construction remain anchored in the RCT; simulations are reported to confirm FDR control, and the method is illustrated in an early-stage non-small-cell lung cancer case study identifying profiles that may benefit from limited resection.
Significance. If the FDR guarantee holds, the framework supplies a statistically principled route to uncertainty-aware beneficiary selection that respects the limited size of RCTs while allowing flexible model training on external data. The explicit separation of training and calibration data, together with the reduction to standard conformal validity under RCT exchangeability, is a clean contribution that could be paired with existing CATE estimators or tabular foundation models.
minor comments (3)
- [Abstract] The abstract states that simulations confirm FDR control but supplies no quantitative results, error bars, or description of the simulation design; adding a brief summary table or figure reference would strengthen the claim.
- The construction of the conformal score and the precise definition of the conformal p-value (including any dependence on the estimated CATE) should be stated explicitly in the main text with an equation or algorithm box, even if the validity argument is standard.
- The lung-cancer case study would benefit from a table reporting the number of discoveries, estimated FDR, and a short description of the selected profiles to make the practical output concrete.
Simulated Author's Rebuttal
We thank the referee for the positive summary of our work and the recommendation for minor revision. The report lists no specific major comments under the MAJOR COMMENTS section.
Circularity Check
No significant circularity; derivation self-contained
The framework separates model training on external data from conformal p-value calibration and FDR control on RCT data. Standard conformal validity, which rests on exchangeability of calibration and test points, holds regardless of how the score function is obtained, provided the score function is fitted without using those calibration or test points. The FDR guarantee therefore does not reduce to a fitted quantity defined by the same data. No self-definitional steps, fitted inputs renamed as predictions, or load-bearing self-citations appear in the provided description or abstract. The central claim rests on established conformal properties rather than internal redefinition.
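The exchangeability argument can be checked numerically with a small sketch. It assumes i.i.d. standard normal draws and three arbitrary score functions that stand in for "any model fixed in advance"; the names `conformal_pvalue` and `rejection_rate` are hypothetical. If the argument holds, the fraction of p-values at or below a level should stay close to that level whichever score function is used.

```python
import numpy as np


def conformal_pvalue(cal_scores, test_score):
    cal_scores = np.asarray(cal_scores, dtype=float)
    return (1.0 + np.sum(cal_scores <= test_score)) / (cal_scores.size + 1.0)


def rejection_rate(score_fn, level=0.1, n_cal=99, n_rep=2000, seed=0):
    """Fraction of replications with p <= level when the test point is
    exchangeable with the calibration points."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_rep):
        z = rng.normal(size=n_cal + 1)  # calibration points plus one test point
        scores = score_fn(z)            # score function fixed before seeing z
        hits += int(conformal_pvalue(scores[:-1], scores[-1]) <= level)
    return hits / n_rep


# Arbitrary score functions; none of them is tuned to the data being scored.
score_functions = [
    ("identity", lambda z: z),
    ("sine", np.sin),
    ("neg-square", lambda z: -z**2),
]
for name, fn in score_functions:
    print(name, rejection_rate(fn))
```

The check breaks down by design if the score function is refitted on the same points it scores, which is the situation the training and calibration split is meant to avoid.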