TabPFN-2.5: Advancing the State of the Art in Tabular Foundation Models

Mixed citation behavior. Most common role is background (56%).

49 Pith papers citing it
Background 56% of classified citations
abstract

The first tabular foundation model, TabPFN, and its successor TabPFNv2 have impacted tabular AI substantially, with dozens of methods building on them and hundreds of applications across different use cases. This report introduces TabPFN-2.5, the next generation of our tabular foundation model, built for datasets with up to 50,000 data points and 2,000 features, a 20x increase in data cells compared to TabPFNv2. TabPFN-2.5 is now the leading method for the industry-standard benchmark TabArena (which contains datasets with up to 100,000 training data points), substantially outperforming tuned tree-based models and matching the accuracy of AutoGluon 1.4, a complex four-hour tuned ensemble that even includes the previous TabPFNv2. Remarkably, default TabPFN-2.5 has a 100% win rate against default XGBoost on small to medium-sized classification datasets (<=10,000 data points, 500 features) and an 87% win rate on larger datasets up to 100K samples and 2K features (85% for regression). For production use cases, we introduce a new distillation engine that converts TabPFN-2.5 into a compact MLP or tree ensemble, preserving most of its accuracy while delivering orders-of-magnitude lower latency and plug-and-play deployment. This new release will immediately strengthen the performance of the many applications and methods already built on the TabPFN ecosystem.

citation-role summary

background 5 · baseline 3 · other 1

years

2026 49

representative citing papers

Causal Foundation Models with Continuous Treatments

cs.LG · 2026-05-14 · unverdicted · novelty 8.0

A transformer foundation model is trained on synthetic data from a novel prior over continuous-treatment data-generating processes to predict treatment-response curves via in-context learning without task-specific fine-tuning.

STRABLE: Benchmarking Tabular Machine Learning with Strings

cs.LG · 2026-05-12 · unverdicted · novelty 8.0

A new corpus of 108 mixed string-numeric tables shows that advanced tabular learners with basic string embeddings perform well on most real-world data, while large LLM encoders help on free-text heavy tables.

Probing Memorization of Tabular In-Context Learning

cs.LG · 2026-06-30 · unverdicted · novelty 7.0

A new probing framework detects moderate parametric memorization signals in tabular in-context learning models under single-task fine-tuning, strongest on low-cardinality tasks, but signals largely disappear under realistic training.

Beyond IID: How General Are Tabular Foundation Models, Really?

cs.LG · 2026-06-29 · unverdicted · novelty 7.0

Tabular foundation models excel on tiny- to medium-sized IID data but are outperformed by traditional tree-based and deep learning models on non-IID, large, and high-dimensional datasets, based on evaluations across 11 models and 142 datasets in the new BeyondArena benchmark.

CalArena: A Large-Scale Post-Hoc Calibration Benchmark

cs.LG · 2026-05-28 · conditional · novelty 7.0

CalArena is a large-scale benchmark that evaluates dozens of post-hoc calibration methods using Post-Hoc Improvement (PHI) in proper scoring rules and finds that smooth functions outperform binning while dedicated multiclass methods are required in high-dimensional settings.

Rethinking Weak Supervision in Anomaly Detection: A Comprehensive Benchmark

cs.LG · 2026-05-25 · accept · novelty 7.0

WSADBench unifies WSAD evaluation across three supervision types, runs 700K experiments on 36 algorithms and 4 modalities, and finds strong correlations between scenarios plus performance boundaries favoring general models except in extreme label scarcity.

Data Language Models: A New Foundation Model Class for Tabular Data

cs.AI · 2026-05-07 · unverdicted · novelty 7.0

Schema-1 is the first Data Language Model that natively understands raw tabular data and outperforms gradient-boosted ensembles, AutoML, and prior tabular foundation models on row-level prediction and imputation tasks.

TFM-Retouche: A Lightweight Input-Space Adapter for Tabular Foundation Models

cs.LG · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

TFM-Retouche is an architecture-agnostic input-space residual adapter that improves tabular foundation model accuracy on 51 datasets by learning input corrections through the frozen backbone, with an identity guard to fall back to the original model.

Proxy-Based Approximation of Shapley and Banzhaf Interactions

cs.LG · 2026-05-21 · unverdicted · novelty 6.0 · 2 refs

ProxySHAP approximates higher-order Shapley and Banzhaf interactions via tree proxies plus residual correction and a polynomial-time interventional TreeSHAP generalization for tree ensembles.
