Pith Number

pith:P2AECYT4

pith:2025:P2AECYT4XDXVW6XLRIORKIDQPS

not attested · not anchored · not stored · refs resolved

TabPFN-2.5: Advancing the State of the Art in Tabular Foundation Models

Adrian Hayler, Alessandro Bonetto, Anurag Garg, Benjamin Jäger, Bernhard Schölkopf, Brendan Roof, Clara Cornu, Dominik Safaric, Felix Birkel, Felix Jablonski, Frank Hutter, Jake Robertson, Klemens Flöge, Lennart Purucker, Léo Grinsztajn, Lilly Charlotte Wehrhahn, Magnus Bühler, Mihir Manium, Noah Hollmann, Oscar Key, Philipp Jund, Rosen Yu, Sauraj Gambhir, Shi Bin Hoo, Simone Alessi, Vladyslav Moroshan

TabPFN-2.5 scales tabular foundation models to 20 times more data cells and leads the TabArena benchmark.

arxiv:2511.08667 v2 · 2025-11-11 · cs.LG · stat.ML

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{P2AECYT4XDXVW6XLRIORKIDQPS}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live · download bundle · merged state

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
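The page does not specify the merge algorithm, so the following is only a minimal sketch of how a deterministic merge can make every mirror arrive at the same state. It assumes a hypothetical event shape (`id`, `ts`, `field`, `value`), UTC ISO 8601 timestamps, and a last-writer-wins rule per field; none of these names come from the Pith schema, and signature checking is left out.

```python
import hashlib
import json
from typing import Any

# Hypothetical events for illustration only; the real bundle format is not shown on this page.
EXAMPLE_EVENTS: list[dict[str, Any]] = [
    {"id": "e1", "ts": "2026-05-17T23:38:53Z", "field": "author_claim", "value": "open"},
    {"id": "e2", "ts": "2026-05-18T08:00:00Z", "field": "citations", "value": "open"},
    {"id": "e3", "ts": "2026-05-19T08:00:00Z", "field": "citations", "value": "closed"},
]


def merge_events(events: list[dict[str, Any]]) -> dict[str, Any]:
    """Fold events into a state using a canonical order, so input order never matters."""
    # Drop repeated copies of the same event (assumes an id uniquely identifies its content).
    unique = {event["id"]: event for event in events}
    # Total order: timestamp first, event id as tie-breaker.
    # Same-format UTC ISO 8601 strings sort chronologically as plain strings.
    ordered = sorted(unique.values(), key=lambda event: (event["ts"], event["id"]))
    state: dict[str, Any] = {}
    for event in ordered:
        state[event["field"]] = event["value"]  # last writer wins per field
    return state


def state_digest(state: dict[str, Any]) -> str:
    """Hash the merged state so two mirrors can compare results cheaply."""
    blob = json.dumps(state, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()
```

Because the events are deduplicated and sorted before folding, two mirrors that received the same events in different orders, or received some of them twice, compute the same digest.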

Claims

C1 strongest claim

Default TabPFN-2.5 has a 100% win rate against default XGBoost on small to medium-sized classification datasets (<=10,000 data points, 500 features) and an 87% win rate on larger datasets up to 100K samples and 2K features (85% for regression).

C2 weakest assumption

That the reported win rates and benchmark leadership on TabArena will generalize to new, unseen datasets outside the benchmark collection and that the training procedure does not contain undisclosed hyperparameter advantages.

C3 one-line summary

TabPFN-2.5 scales tabular foundation models to 20x larger datasets, outperforms tuned tree models on TabArena, achieves near-perfect win rates against default XGBoost, and adds a distillation engine for fast production deployment.

References

250 extracted · 250 resolved · 7 Pith anchors

[1] arXiv:2506.16791 [cs] 2025

[2] XGBoost: A scalable tree boosting system 2016

[3] CatBoost: unbiased boosting with categorical features. Advances in Neural Information Processing Systems, 31, 2018

[4] LightGBM: A highly efficient gradient boosting decision tree 2017

[5] Applying constraint satisfaction techniques to job shop scheduling 2001 · doi:10.1023/a

Formal links

1 machine-checked theorem link

Cited by

35 papers in Pith

Proxy-Based Approximation of Shapley and Banzhaf Interactions

Correcting Class Imbalance in Prior-Data Fitted Networks for Tabular Classification

Tabular foundation models for robust calibration of near-infrared chemical sensing data

FEAT: A Linear-Complexity Foundation Model for Extremely Large Structured Data

Receipt and verification

First computed	2026-05-17T23:38:53.573895Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

7e8041627cb8ef5b7aeb8a1d1520707ca72c006a8dd784ad25a232dea2aa2ea1

Aliases

arxiv: 2511.08667 · arxiv_version: 2511.08667v2 · doi: 10.48550/arxiv.2511.08667 · pith_short_12: P2AECYT4XDXV · pith_short_16: P2AECYT4XDXVW6XL · pith_short_8: P2AECYT4
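The identifiers on this page appear to be related in a simple way: the full Pith Number coincides with the first 26 characters of the RFC 4648 base32 encoding of the canonical hash, and the short aliases are prefixes of it. This is an observation from the values shown here, not a rule stated by the page, and the function name `pith_number_from_hash` is made up for this sketch.

```python
import base64

CANONICAL_HASH = "7e8041627cb8ef5b7aeb8a1d1520707ca72c006a8dd784ad25a232dea2aa2ea1"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the leading characters."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


full = pith_number_from_hash(CANONICAL_HASH)
# Short aliases as listed above: pith_short_8, pith_short_12, pith_short_16.
aliases = {n: full[:n] for n in (8, 12, 16)}

print(full)         # P2AECYT4XDXVW6XLRIORKIDQPS
print(aliases[8])   # P2AECYT4
print(aliases[12])  # P2AECYT4XDXV
print(aliases[16])  # P2AECYT4XDXVW6XL
```

Under this reading, a short alias is a truncated hash prefix, so shorter forms are more convenient but carry a higher chance of collision than the full identifier.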

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/P2AECYT4XDXVW6XLRIORKIDQPS \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 7e8041627cb8ef5b7aeb8a1d1520707ca72c006a8dd784ad25a232dea2aa2ea1

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "34d2eb8cbaa7899adab58fe896c619c419da6ad5a95e700f54d3bf64098685d6",
    "cross_cats_sorted": [
      "stat.ML"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2025-11-11T18:57:15Z",
    "title_canon_sha256": "3237b357c6bb4bfa8fd6e8fa85eaf5e2d5d077dacfe3e658bce83dcd9adcc2b2"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2511.08667",
    "kind": "arxiv",
    "version": 2
  }
}
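The same check can be run offline against the record printed above. The sketch below mirrors the serialization used in the one-liner under "Verify this Pith Number yourself" (sorted keys, compact separators, no ASCII escaping, UTF-8). The names `RECORD` and `canonical_hash` are chosen for this example; the result should equal the canonical hash only if the served record is identical to the one shown here.

```python
import hashlib
import json

# Canonical record copied from this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "34d2eb8cbaa7899adab58fe896c619c419da6ad5a95e700f54d3bf64098685d6",
        "cross_cats_sorted": ["stat.ML"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2025-11-11T18:57:15Z",
        "title_canon_sha256": "3237b357c6bb4bfa8fd6e8fa85eaf5e2d5d077dacfe3e658bce83dcd9adcc2b2",
    },
    "schema_version": "1.0",
    "source": {"id": "2511.08667", "kind": "arxiv", "version": 2},
}


def canonical_hash(record: dict) -> str:
    """SHA-256 over a key-sorted, whitespace-free JSON serialization."""
    blob = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


# Compare this value with the hash listed under "Canonical hash".
print(canonical_hash(RECORD))
```

Sorting keys and removing whitespace is what makes the digest independent of how a mirror happens to pretty-print or order the JSON.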