TabCF: Distributional Control Function Estimation with Tabular Foundation Models
Pith reviewed 2026-05-08 05:17 UTC · model grok-4.3
The pith
Tabular foundation models can be directly repurposed for control function regression to estimate interventional means and quantiles with little tuning.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
TabCF performs control function regression with tabular foundation models, enabling accurate, fast, and tuning-light estimation of interventional quantities such as means and quantiles. A copula-based approximation is proposed to handle dependence among multivariate outcomes. Empirical results show favorable comparisons to representative methods across small- to medium-sized synthetic and real data scenarios.
What carries the argument
TabCF, the direct use of tabular foundation models as control function regressors within the standard IV or CF identification strategy.
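As a rough illustration of the control function step named above, here is a minimal sketch on simulated data. It is not the paper's implementation: ordinary least squares stands in for the tabular foundation model in both stages, and the data-generating process, its coefficients, and the helper names (`cov`, `ols_slope`, `ols_two`) are assumptions made only for this example.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def ols_slope(x, y):
    """Slope of a one-regressor least-squares fit with intercept."""
    return cov(x, y) / cov(x, x)

def ols_two(x1, x2, y):
    """Slopes of y on (x1, x2) with intercept, via the 2x2 normal equations."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Hypothetical data-generating process: instrument z, hidden confounder u.
rng = random.Random(0)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
t = [zi + ui + rng.gauss(0, 0.5) for zi, ui in zip(z, u)]
y = [2.0 * ti + 3.0 * ui + rng.gauss(0, 0.5) for ti, ui in zip(t, u)]  # true effect: 2

# Naive regression of y on t absorbs the confounder
# (population slope 2 + 3/2.25, about 3.33).
naive_slope = ols_slope(t, y)

# Stage 1: predict treatment from the instrument and keep the residual v.
gamma = ols_slope(z, t)
z_bar, t_bar = mean(z), mean(t)
v = [ti - (t_bar + gamma * (zi - z_bar)) for ti, zi in zip(t, z)]

# Stage 2: regress the outcome on treatment and the control function v.
cf_slope, v_coef = ols_two(t, v, y)

print(f"naive slope: {naive_slope:.3f}")
print(f"control-function slope: {cf_slope:.3f}")
```

In this toy setup, replacing the two least-squares fits with a more flexible regressor leaves the structure unchanged: the first stage produces a residual, and the second stage conditions on it.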
If this is right
- Estimation extends beyond average effects to full distributional causal quantities.
- Causal analysis requires less model-specific tuning than many existing CF or IV approaches.
- Identification remains transparent because the control function step follows standard theory.
- A copula step allows joint distributional estimates for multiple outcomes.
- The method serves as a ready baseline for comparing future causal estimators.
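The copula step mentioned in the list above can be pictured with a small sketch. The source does not say which copula family the paper uses, so a Gaussian copula is assumed here purely for illustration; the function names and the toy data are likewise hypothetical. The idea shown is generic: keep each outcome's marginal sample as is, estimate dependence on the rank scale, and recombine.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def normal_scores(xs):
    """Map each value to the standard-normal quantile of rank / (n + 1)."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores

def corr(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def empirical_quantile(sorted_xs, p):
    idx = min(int(p * len(sorted_xs)), len(sorted_xs) - 1)
    return sorted_xs[idx]

def sample_joint(y1, y2, m, rng):
    """Draw m pairs whose marginals follow y1 and y2 and whose dependence
    follows a Gaussian copula fitted on normal scores."""
    rho = corr(normal_scores(y1), normal_scores(y2))
    s1, s2 = sorted(y1), sorted(y2)
    draws = []
    for _ in range(m):
        a = rng.gauss(0, 1)
        b = rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        draws.append((empirical_quantile(s1, STD_NORMAL.cdf(a)),
                      empirical_quantile(s2, STD_NORMAL.cdf(b))))
    return rho, draws

# Toy outcomes with non-Gaussian marginals and latent correlation 0.7.
rng = random.Random(1)
n = 2000
latent = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
y1 = [math.exp(a) for a, _ in latent]
y2 = [(0.7 * a + math.sqrt(1 - 0.49) * b) ** 3 for a, b in latent]

rho_hat, draws = sample_joint(y1, y2, 1000, rng)
print(f"estimated copula correlation: {rho_hat:.3f}")
```

A single correlation parameter cannot represent tail dependence or asymmetric dependence, which is one reason the adequacy of a copula approximation is listed as a load-bearing premise.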
Where Pith is reading between the lines
- If foundation model quality keeps improving on tabular data, TabCF performance would improve without any retraining on the causal task.
- The same repurposing idea could apply in other causal settings where control functions are used, such as policy evaluation.
- Domains with frequent unmeasured confounding might adopt this style of estimator more readily once tabular foundation models become widely available.
Load-bearing premise
Tabular foundation models can be directly repurposed as accurate control function estimators without substantial additional fitting or tuning, and the copula approximation sufficiently captures dependence for multivariate outcomes.
What would settle it
A new dataset or simulation where known true interventional quantiles are available and TabCF estimates deviate substantially from them while heavily tuned alternatives recover the truth.
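A toy version of such a check might look like the following sketch. Everything in it is an assumption for illustration: the linear Gaussian data-generating process, the choice of intervention level `t0`, and the use of a simple instrumental-variable slope as a stand-in for a fitted control function model. It only shows how estimated interventional quantiles can be compared with values that are known by construction.

```python
import math
import random
from statistics import NormalDist

def mean(xs):
    return sum(xs) / len(xs)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def empirical_quantile(xs, p):
    s = sorted(xs)
    return s[min(int(p * len(s)), len(s) - 1)]

# Hypothetical DGP: under do(T = t0), Y ~ Normal(2 * t0, 3^2 + 0.5^2).
rng = random.Random(2)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
t = [zi + ui + rng.gauss(0, 0.5) for zi, ui in zip(z, u)]
y = [2.0 * ti + 3.0 * ui + rng.gauss(0, 0.5) for ti, ui in zip(t, u)]

# Stand-in estimator: IV slope, then shift each outcome to treatment level t0.
# This relies on the additive linear effect assumed in the DGP above.
iv_slope = cov(z, y) / cov(z, t)
t0 = 2.0
interventional_sample = [yi + iv_slope * (t0 - ti) for yi, ti in zip(y, t)]

true_sd = math.sqrt(3.0 ** 2 + 0.5 ** 2)
errors = {}
for p in (0.1, 0.5, 0.9):
    truth = 2.0 * t0 + true_sd * NormalDist().inv_cdf(p)
    estimate = empirical_quantile(interventional_sample, p)
    errors[p] = estimate - truth
    print(f"p={p}: truth={truth:.3f} estimate={estimate:.3f}")
```

A settling experiment in the sense described above would swap in TabCF and tuned alternatives as the estimator and use a design where the two are expected to disagree.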
Original abstract
Instrumental variable (IV) and control function (CF) methods are powerful tools for causal effect estimation in the presence of unmeasured confounding, yet most existing approaches target only mean effects and/or demand substantial fitting and tuning effort. In this paper, we introduce a simple method, TabCF, for control function regression using tabular foundation models, which enables accurate, fast, identification-transparent, and tuning-light causal estimation of distributional quantities, such as interventional means and quantiles; we also propose a copula-based approximation for multivariate outcomes. TabCF performs favorably against representative methods across a broad range of small- to medium-sized synthetic and real data scenarios. The central message is two-fold: for practitioners, it highlights that TabCF is an effective tool for distributional causal inference; for researchers, it suggests that the proposed approach could be considered a strong baseline for future method development. Code is available at https://github.com/GepingChen/TabCF.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces TabCF, a simple method for control function regression that repurposes tabular foundation models to estimate distributional causal quantities (interventional means and quantiles) under unmeasured confounding. It also proposes a copula-based approximation for multivariate outcomes and reports favorable performance relative to representative baselines across small- to medium-sized synthetic and real datasets. The central message positions TabCF as an effective practitioner tool and a strong baseline for future work, with code released for reproducibility.
Significance. If the identification and accuracy claims hold, the work could meaningfully lower the barrier to distributional IV estimation by reducing tuning and fitting demands through pre-trained models. This would be valuable for applied causal inference on tabular data and could establish a practical baseline, especially given the code release.
major comments (2)
- [§3] §3 (TabCF method): The central claim that pre-trained tabular foundation models can be plugged in directly as control-function regressors to recover unbiased interventional distributions hinges on the model accurately estimating the conditional distribution of treatment given instruments and covariates. The manuscript provides no derivation or diagnostic showing that the foundation-model pre-training objective aligns with this residual estimation task; without it, the CF correction may remain biased even when marginal predictions appear accurate.
- [§5] §5 (Experiments): The synthetic results are reported as favorable, yet the data-generating processes are not characterized with respect to the strength or form of unmeasured confounding, nor is there sensitivity analysis showing robustness when the foundation model's inductive biases are deliberately mismatched. This leaves open the possibility that reported gains reflect alignment with the pre-training distribution rather than general validity of the CF approach.
minor comments (2)
- [Abstract / §1] The term 'identification-transparent' is used in the abstract and introduction but is never formally defined or linked to a specific property of the estimator (e.g., explicit residual recovery or closed-form identification).
- [§5] Table captions and axis labels in the experimental figures should explicitly state the number of Monte Carlo replications and the precise metrics (e.g., bias, coverage, or quantile error) being plotted.
Simulated Author's Rebuttal
We thank the referee for the constructive and insightful comments. We address each major comment point by point below, providing our strongest honest defense while indicating planned revisions where appropriate.
- Referee: [§3] §3 (TabCF method): The central claim that pre-trained tabular foundation models can be plugged in directly as control-function regressors to recover unbiased interventional distributions hinges on the model accurately estimating the conditional distribution of treatment given instruments and covariates. The manuscript provides no derivation or diagnostic showing that the foundation-model pre-training objective aligns with this residual estimation task; without it, the CF correction may remain biased even when marginal predictions appear accurate.
  Authors: We appreciate the referee highlighting the need for clearer justification of the first-stage alignment. Tabular foundation models are pre-trained on diverse tabular corpora specifically to capture conditional distributions P(T | covariates), which matches the control-function requirement of estimating the conditional distribution (or expectation) of treatment given instruments and covariates to form the residual. This is not an arbitrary plug-in; the pre-training objective directly supports accurate residual recovery for the CF correction. While the manuscript does not include a formal bias derivation (as the focus is on practical estimation), we will revise §3 to add a concise paragraph explaining this alignment based on the nature of tabular pre-training and include first-stage diagnostic metrics (e.g., prediction accuracy on held-out data) in the experiments to empirically verify the residual quality.
  Revision: partial
- Referee: [§5] §5 (Experiments): The synthetic results are reported as favorable, yet the data-generating processes are not characterized with respect to the strength or form of unmeasured confounding, nor is there sensitivity analysis showing robustness when the foundation model's inductive biases are deliberately mismatched. This leaves open the possibility that reported gains reflect alignment with the pre-training distribution rather than general validity of the CF approach.
  Authors: We thank the referee for this valuable suggestion to bolster the experimental claims. The synthetic DGPs in §5 and the appendix vary structural equation coefficients to induce different intensities and forms of unmeasured confounding (e.g., via varying correlations between the latent confounder and treatment). However, we agree these were not explicitly quantified or subjected to targeted sensitivity checks for model mismatch. We will revise §5 to include explicit characterization of confounding strength (such as induced correlations) across DGPs, along with new sensitivity analyses that deliberately mismatch the foundation model's inductive biases (e.g., via ablated pre-training or alternative base models) to demonstrate that gains are not solely due to pre-training alignment.
  Revision: yes
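The characterization of confounding strength discussed in the second response could be reported along the following lines. This is a hedged sketch, not the authors' experimental code: the data-generating process, the coefficient grid, and the function names are hypothetical, and the latent confounder is observable here only because the data are simulated.

```python
import math
import random

def mean(xs):
    return sum(xs) / len(xs)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def corr(a, b):
    return cov(a, b) / math.sqrt(cov(a, a) * cov(b, b))

def simulate(gamma, n, rng):
    """Hypothetical DGP; gamma scales how strongly the confounder drives treatment."""
    z = [rng.gauss(0, 1) for _ in range(n)]
    u = [rng.gauss(0, 1) for _ in range(n)]
    t = [zi + gamma * ui + rng.gauss(0, 0.5) for zi, ui in zip(z, u)]
    y = [2.0 * ti + 3.0 * ui + rng.gauss(0, 0.5) for ti, ui in zip(t, u)]
    return z, u, t, y

rng = random.Random(3)
report = []
for gamma in (0.0, 0.5, 1.0, 2.0):
    z, u, t, y = simulate(gamma, 4000, rng)
    strength = corr(u, t)                        # induced confounder-treatment correlation
    naive_bias = cov(t, y) / cov(t, t) - 2.0     # bias of the unadjusted slope
    report.append((gamma, strength, naive_bias))
    print(f"gamma={gamma}: corr(U, T)={strength:.3f}, naive bias={naive_bias:.3f}")
```

Reporting the induced correlation next to the bias of an unadjusted estimator gives readers a scale for how hard each synthetic scenario is before any method comparison.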
Circularity Check
No significant circularity in derivation chain
full rationale
The paper introduces TabCF as a method that repurposes pre-trained tabular foundation models for control function regression to estimate interventional distributions, with an added copula approximation for multivariate cases. No equations, derivations, or self-citations appear in the provided text that reduce any claimed prediction or result to an input quantity by construction. The central claims rest on empirical performance across synthetic and real data rather than tautological steps such as fitting a parameter and relabeling it as a prediction. The approach is presented as identification-transparent and tuning-light precisely because it delegates the core regression to external foundation models, avoiding internal circular reductions. This is the expected non-finding for a methods paper whose validity hinges on external benchmarks rather than self-referential math.