TabICLv2: A better, faster, scalable, and open tabular foundation model. arXiv preprint arXiv:2602.11139

Jingang Qu, David Holzmüller, Gaël Varoquaux, Marine Le Morvan · 2026 · arXiv 2602.11139

Baseline reference. 57% of citing Pith papers use this work as a benchmark or comparison.

35 Pith papers citing it

citation-role summary

baseline: 3 · background: 2 · dataset: 1 · method: 1

citation-polarity summary

baseline: 3 · background: 2 · use dataset: 1 · use method: 1

representative citing papers

Causal Foundation Models with Continuous Treatments

cs.LG · 2026-05-14 · unverdicted · novelty 8.0

A transformer foundation model is trained on synthetic data from a novel prior over continuous-treatment data-generating processes to predict treatment-response curves via in-context learning without task-specific fine-tuning.

STRABLE: Benchmarking Tabular Machine Learning with Strings

cs.LG · 2026-05-12 · unverdicted · novelty 8.0

A new corpus of 108 mixed string-numeric tables shows that advanced tabular learners with basic string embeddings perform well on most real-world data, while large LLM encoders help on free-text-heavy tables.

Beyond IID: How General Are Tabular Foundation Models, Really?

cs.LG · 2026-06-29 · unverdicted · novelty 7.0

Tabular foundation models excel on tiny- to medium-sized IID data but are outperformed by traditional tree-based and deep learning models on non-IID, large, and high-dimensional datasets, based on evaluations across 11 models and 142 datasets in the new BeyondArena benchmark.

FlexTab: A Flexible Encoder-Decoder Architecture for In-Context Learning Across Diverse Tabular Tasks

cs.LG · 2026-06-29 · unverdicted · novelty 7.0 · 2 refs

FlexTab shows a shared encoder with task-specific decoders trained on unlabeled tables can achieve SOTA on classification, regression, anomaly detection and entity matching while staying competitive on relational entity classification.

Computational Identifiability

cs.LG · 2026-06-08 · unverdicted · novelty 7.0

The paper defines computational identifiability as success of a finite search procedure in finding an empirical estimator for a causal query within error tolerance, conditional on the search assumptions and procedure.

TS-ICL: A Flexible Time-Indexed Foundation Model for Time Series via In-Context Learning

cs.LG · 2026-06-04 · unverdicted · novelty 7.0

TS-ICL introduces a probabilistic in-context learning encoder-regressor Transformer that unifies forecasting and imputation for time series via timestamp-aligned regression trained on synthetic causal data.

Speedrunning Tabular Foundation Model Pretraining

cs.LG · 2026-06-02 · unverdicted · novelty 7.0

A speedrun benchmark for nanoTabPFN pretraining reports a record of 0.92 minutes to target performance, an 81x speedup over the 74.32-minute baseline using 22x fewer synthetic datasets.

TabPrep: Closing the Feature Engineering Gap in Tabular Benchmarks

cs.LG · 2026-06-01 · unverdicted · novelty 7.0

TabPrep is a new feature engineering pipeline that targets three data patterns and improves performance of tree-based, neural, linear, and foundation models on tabular benchmarks, often more than model architecture changes.

CalArena: A Large-Scale Post-Hoc Calibration Benchmark

cs.LG · 2026-05-28 · conditional · novelty 7.0

CalArena is a large-scale benchmark that evaluates dozens of post-hoc calibration methods using Post-Hoc Improvement (PHI) in proper scoring rules and finds that smooth functions outperform binning while dedicated multiclass methods are required in high-dimensional settings.

SurvivalPFN: Amortizing Survival Prediction via In-Context Bayesian Inference

cs.LG · 2026-05-15 · unverdicted · novelty 7.0

SurvivalPFN amortizes Bayesian survival analysis for right-censored data by pretraining a prior-data fitted network on synthetic identifiable DGPs and then performing in-context inference, achieving competitive results on 61 real datasets.

MulTaBench: Benchmarking Multimodal Tabular Learning with Text and Image

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

MulTaBench is a new collection of 40 image-tabular and text-tabular datasets designed to test target-aware representation tuning in multimodal tabular models.

PFN-TS: Thompson Sampling for Contextual Bandits via Prior-Data Fitted Networks

stat.ML · 2026-05-11 · unverdicted · novelty 7.0

PFN-TS converts PFN posterior predictives into mean-reward samples for Thompson sampling using a subsampled predictive CLT, with consistency proofs, regret bounds, and strong empirical performance on synthetic and real bandit benchmarks.

Data Language Models: A New Foundation Model Class for Tabular Data

cs.AI · 2026-05-07 · unverdicted · novelty 7.0

Schema-1 is the first Data Language Model that natively understands raw tabular data and outperforms gradient-boosted ensembles, AutoML, and prior tabular foundation models on row-level prediction and imputation tasks.

TFM-Retouche: A Lightweight Input-Space Adapter for Tabular Foundation Models

cs.LG · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

TFM-Retouche is an architecture-agnostic input-space residual adapter that improves tabular foundation model accuracy on 51 datasets by learning input corrections through the frozen backbone, with an identity guard to fall back to the original model.

RamanBench: A Large-Scale Benchmark for Machine Learning on Raman Spectroscopy

cs.LG · 2026-05-03 · unverdicted · novelty 7.0

RamanBench unifies 74 datasets into the first large-scale reproducible benchmark for ML on Raman spectra, finding tabular foundation models outperform baselines but no method generalizes across datasets.

TabPATE: Differentially Private Tabular In-Context Learning Without Public Data

cs.LG · 2026-06-30 · unverdicted · novelty 6.0

TabPATE applies a PATE-style private aggregation to synthetic tabular queries generated from feature ranges, enabling private in-context learning with near-random membership inference success while keeping competitive utility.

In-Context Learning for Latent Space Bayesian Optimization

cs.LG · 2026-06-08 · unverdicted · novelty 6.0

Complementing tabular foundation model pretraining with LSBO-specific synthetic tasks and a regularizer yields strong performance on held-out molecular optimization benchmarks.

LUCoS: Latent Unsupervised Context Selection for Tabular Foundation Models

cs.LG · 2026-05-26 · unverdicted · novelty 6.0

LUCoS replaces raw tabular geometry with unsupervised PFN latent embeddings for medoid-based context selection and ranks first on mean AUC, ACC, and F1 across 67 datasets and six budgets.

Shaping the Prior: How Synthetic Task Distributions Determine Tabular Foundation Model Quality

cs.LG · 2026-05-18 · unverdicted · novelty 6.0

O'Prior, a compositional synthetic prior with hierarchical SCMs, realism engines, stress modules, and curriculum protocols, improves tabular foundation model accuracy and robustness on real benchmarks when architecture and compute are held fixed.

Pocket Foundation Models: Distilling TFMs into CPU-Ready Gradient-Boosted Trees

cs.LG · 2026-05-18 · accept · novelty 6.0

Distilling TabICLv2 into XGBoost via stratified OOF labeling yields 0.882 macro-mean AUC (96.5% of teacher) at 1.9 ms CPU across 153 datasets, with significant gains over tuned CatBoost on low-dimensional data.

KGPFN: Unlocking the Potential of Knowledge Graph Foundation Model via In-Context Learning

cs.AI · 2026-05-14 · unverdicted · novelty 6.0

KGPFN pretrains on multiple KGs to learn relation patterns, then performs query-specific reasoning by encoding local context with NBFNet and global context via retrieved instances aggregated in a PFN with feature- and sample-level attention.

Online Sharp-Calibrated Bayesian Optimization

cs.LG · 2026-05-11 · unverdicted · novelty 6.0

OSCBO adaptively balances Gaussian process sharpness and calibration in Bayesian optimization by casting hyperparameter selection as constrained online learning, while preserving sublinear regret bounds.

In-Context Black-Box Optimization with Unreliable Feedback

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

FICBO pretrains a feedback-aware transformer with a structured prior on feedback distortion to adaptively exploit or ignore unreliable auxiliary signals during in-context black-box optimization.

Tabular foundation models for in-context prediction of molecular properties

cs.LG · 2026-04-17 · unverdicted · novelty 6.0

Tabular foundation models achieve high accuracy in molecular property prediction through in-context learning, with up to 100% win rates on MoleculeACE tasks when paired with CheMeleon embeddings.
