{"total":14,"items":[{"citing_arxiv_id":"2605.18971","ref_index":22,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Shaping the Prior: How Synthetic Task Distributions Determine Tabular Foundation Model Quality","primary_cat":"cs.LG","submitted_at":"2026-05-18T18:00:45+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"O'Prior, a compositional synthetic prior with hierarchical SCMs, realism engines, stress modules, and curriculum protocols, improves tabular foundation model accuracy and robustness on real benchmarks when architecture and compute are held fixed.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.18383","ref_index":10,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"TabH2O: A Unified Foundation Model for Tabular Prediction","primary_cat":"cs.LG","submitted_at":"2026-05-18T13:27:54+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"TabH2O presents a unified tabular foundation model with dual-head architecture and single-stage pretraining that achieves an average rank of 2.55 on the TALENT benchmark, outperforming several established methods.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.15488","ref_index":37,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"SurvivalPFN: Amortizing Survival Prediction via In-Context Bayesian Inference","primary_cat":"cs.LG","submitted_at":"2026-05-15T00:13:04+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"SurvivalPFN amortizes Bayesian survival analysis for right-censored data by pretraining a prior-data fitted network on synthetic identifiable DGPs and then performing in-context inference, achieving competitive results on 61 real datasets.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"identifiability; under dependent censoring, the event-time distribution is not nonparametrically identifiable from observed data alone, and our method is not valid. Extending SurvivalPFN to identifiable dependent-censoring priors, such as copula methods [24, 105], is an important direction. Additionally, SurvivalPFN inherits the size-scalability trade-off of PFN-style models, with reduced relative performance on larger tables [37, 2]; improving long-context inference is left for future work. 9 References [1] Christophe Andrieu, Nando De Freitas, Arnaud Doucet, and Michael I Jordan. An introduction to mcmc for machine learning.Machine learning, 50(1):5-43, 2003. [2] Vahid Balazadeh, Hamidreza Kamkari, Valentin Thomas, Benson Li, Junwei Ma, Jesse C Cresswell, and Rahul G Krishnan."},{"citing_arxiv_id":"2605.12924","ref_index":25,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"IV-ICL: Bounding Causal Effects with Instrumental Variables via In-Context Learning","primary_cat":"cs.LG","submitted_at":"2026-05-13T03:00:08+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"IV-ICL learns the marginal posterior of causal effects via in-context learning to derive bounds as quantiles, recovering the identified set more reliably than variational inference while running 20-500x faster.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.12904","ref_index":15,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"VIP-COP: Context Optimization for Tabular Foundation Models","primary_cat":"cs.LG","submitted_at":"2026-05-13T02:28:31+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"VIP-COP is a black-box method that optimizes context for tabular foundation models by ranking and selecting high-value samples and features via online KernelSHAP regression, outperforming baselines on large high-dimensional data.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.12308","ref_index":28,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"In-context learning to predict critical transitions in dynamical systems","primary_cat":"cs.LG","submitted_at":"2026-05-12T15:56:03+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"TipPFN uses prior-data fitted networks and in-context learning on synthetic bifurcation data to detect proximity to critical transitions in unseen dynamical systems and real observations.","context_count":1,"top_context_role":"baseline","top_context_polarity":"baseline","context_text":"BaselinesWe compare TipPFN against classical EWS, including AR1, variance, and skewness in rolling windows, summarize their temporal trends with Kendall-τ [41], and obtain ROC curves by thresholding these trend scores, following standard practice [ 9]. ML baselines include Bury [ 22], Huang [18], and Zhuge [24], operating on univariate time series. TabPFN2.6 [28] provides a state-of- the-art ICL baseline; more details are provided in Appendix C.2. Querying procedure & scoringAll methods are evaluated on matched moving query windows ending ∆ time steps before the critical event. Uni-variate baselines were evaluated on the driving time series, zM(t). PFN-based models additionally receive context episodes, whereas classical and"},{"citing_arxiv_id":"2605.12292","ref_index":29,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"STRABLE: Benchmarking Tabular Machine Learning with Strings","primary_cat":"cs.LG","submitted_at":"2026-05-12T15:47:50+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"A new corpus of 108 mixed string-numeric tables shows that advanced tabular learners with basic string embeddings perform well on most real-world data, while large LLM encoders help on free-text heavy tables.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"LLM representations are reduced to 30 dimensions. Post-processing.We evaluate three distinct strategies within this stage: (1) Principal Component Analysis3 (PCA) on the embeddings of each string column with 30 PC (2) Standard scaling before PCA, which equalises per-dimension variance, preventing high-variance dimensions from dominating the principal components [29, 53] (3) Retain the first 30 embedding dimensions (No PCA). While being a natural choice for Matryoshka-trained models [35] like Qwen-3-8B, whose leading dimen- sions are optimized for semantic content, this strategy is applied across all encoders to match the dimensionality of the PCA-based pipelines. Learners.The resulting numerical tables are used as input to tabular learner of varying sophistica-"},{"citing_arxiv_id":"2605.10137","ref_index":14,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"PFN-TS: Thompson Sampling for Contextual Bandits via Prior-Data Fitted Networks","primary_cat":"stat.ML","submitted_at":"2026-05-11T07:46:10+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"PFN-TS converts PFN posterior predictives into mean-reward samples for Thompson sampling using a subsampled predictive CLT, with consistency proofs, regret bounds, and strong empirical performance on synthetic and real bandit benchmarks.","context_count":1,"top_context_role":"method","top_context_polarity":"use_method","context_text":"these approaches require hyperparameter choices, such as kernel bandwidth, network architecture, or tree depth, that are difficult to tune in an online setting. ∗Correspondence:yanshuo@nus.edu.sg Preprint. arXiv:2605.10137v1 [stat.ML] 11 May 2026 Recently, tabular foundation models (TFMs) have emerged as state-of-the-art for regression and classification on small-to-medium tabular datasets. Among TFMs, TabPFN [14] and TabICLv2 [22] stand out because they are implemented as amortized Bayesian engines, also known as Prior-Data Fitted Networks (PFNs) [ 19]: They are trained to approximate the Bayesian posterior predictive distributions (PPDs) arising from a broad prior over data-generating processes. When deployed on a new dataset, they approximate the PPD via a single forward pass, without any parameter updates,"},{"citing_arxiv_id":"2605.09424","ref_index":27,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Tabular Foundation Model for Generative Modelling","primary_cat":"cs.LG","submitted_at":"2026-05-10T08:52:28+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"TabFORGE generates high-quality synthetic tabular data by leveraging pretrained causality-aware representations in a two-stage diffusion-decoder architecture that mitigates latent distribution shifts.","context_count":1,"top_context_role":"method","top_context_polarity":"use_method","context_text":"construct a transferable and causality-aware latent space across heterogeneous datasets. Tokeniser Sψ handles heterogeneity without relying on metadata.Since feature sets often vary across tabular datasets, learning a dataset-specific latent space would substantially limit cross-dataset transferability. Therefore, TabFORGE follows the TabPFN tokenisation strategy [27] and maps any table into a shared k-dimensional latent space while preserving the identity of each feature, that is, X∈R N×(D+1) →T∈R N×(D+1)×k . The per-feature tokenisation is computed directly from feature values (further details are in Appendix B.2) and does not rely on metadata such as column names or descriptions, thereby enabling flexible cross-dataset training and inference."},{"citing_arxiv_id":"2605.07799","ref_index":14,"ref_count":2,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Toward Privileged Foundation Models:LUPI for Accelerated and Improved Learning","primary_cat":"cs.LG","submitted_at":"2026-05-08T14:36:12+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"PIQL integrates privileged information to accelerate convergence, lower loss, and improve generalization in tabular foundation models.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"[12] Songqiao Han, Xiyang Hu, Hailiang Huang, Minqi Jiang, and Yue Zhao. Adbench: Anomaly detection benchmark.Advances in Neural Information Processing Systems, 35, 2022. [13] Noah Hollmann, Samuel Müller, Katharina Eggensperger, and Frank Hutter. TabPFN: A transformer that solves small tabular classification problems in a second. InThe Eleventh International Conference on Learning Representations (ICLR), 2023. [14] Noah Hollmann, Samuel Müller, Lennart Purucker, Arjun Krishnakumar, Max Körfer, Shi Bin Hoo, Robin Tibor Schirrmeister, and Frank Hutter. Accurate predictions on small data with a tabular foundation model.Nature, 637(8045):319-326, 2025. [15] Bowen Jiang, Yuan Yuan, Maohao Shen, Zhuoqun Hao, Zhangchen Xu, Zichen Chen, Ziyi Liu, Anvesh Rao Vijjini, Jiashu He, Hanchao Yu, et al."},{"citing_arxiv_id":"2605.05993","ref_index":20,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"TabCF: Distributional Control Function Estimation with Tabular Foundation Models","primary_cat":"stat.ML","submitted_at":"2026-05-07T10:44:07+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"TabCF is a tuning-light method using tabular foundation models for control function regression to estimate distributional causal effects such as interventional means and quantiles.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.04911","ref_index":12,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Breaking the Quality-Privacy Tradeoff in Tabular Data Generation via In-Context Learning","primary_cat":"cs.LG","submitted_at":"2026-05-06T13:38:16+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"DiffICL breaks the quality-privacy tradeoff in small-data tabular synthesis by using in-context learning on pretrained structural priors to generate data that is both higher quality and less memorizing of training samples.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.03808","ref_index":75,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Agentic-imodels: Evolving agentic interpretability tools via autoresearch","primary_cat":"cs.AI","submitted_at":"2026-05-05T14:35:47+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Agentic-imodels evolves scikit-learn regressors via an autoresearch loop to jointly boost predictive performance and LLM-simulatability, improving downstream agentic data science tasks by up to 73% on the BLADE benchmark.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.04175","ref_index":48,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Uncertainty-Aware Foundation Models for Clinical Data","primary_cat":"cs.LG","submitted_at":"2026-04-05T16:44:13+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"The work introduces uncertainty-aware foundation models for clinical data by learning set-valued patient representations that enforce consistency across partial observations and integrate multimodal self-supervised objectives.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}