Pith Number

pith:P2AECYT4

pith:2025:P2AECYT4XDXVW6XLRIORKIDQPS
not attested · not anchored · not stored · refs resolved

TabPFN-2.5: Advancing the State of the Art in Tabular Foundation Models

Adrian Hayler, Alessandro Bonetto, Anurag Garg, Benjamin Jäger, Bernhard Schölkopf, Brendan Roof, Clara Cornu, Dominik Safaric, Felix Birkel, Felix Jablonski, Frank Hutter, Jake Robertson, Klemens Flöge, Lennart Purucker, Léo Grinsztajn, Lilly Charlotte Wehrhahn, Magnus Bühler, Mihir Manium, Noah Hollmann, Oscar Key, Philipp Jund, Rosen Yu, Sauraj Gambhir, Shi Bin Hoo, Simone Alessi, Vladyslav Moroshan

TabPFN-2.5 scales tabular foundation models to 20 times more data cells and leads the TabArena benchmark.

arxiv:2511.08667 v2 · 2025-11-11 · cs.LG · stat.ML

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{P2AECYT4XDXVW6XLRIORKIDQPS}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open · sign in to claim
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

Default TabPFN-2.5 has a 100% win rate against default XGBoost on small to medium-sized classification datasets (≤10,000 data points, 500 features) and an 87% win rate on larger datasets of up to 100K samples and 2K features (85% for regression).

C2 weakest assumption

The reported win rates and TabArena leadership are assumed to generalize to new, unseen datasets outside the benchmark collection, and the training procedure is assumed to contain no undisclosed hyperparameter advantages.

C3 one line summary

TabPFN-2.5 scales tabular foundation models to 20x larger datasets, outperforms tuned tree models on TabArena, achieves near-perfect win rates against default XGBoost, and adds a distillation engine for fast production deployment.
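The win rates quoted in C1 can be read as the share of benchmark datasets on which one model scores better than the other. The sketch below illustrates that arithmetic only. The function name `win_rate`, the convention of counting a tie as half a win, and the score lists are all assumptions made for illustration; they are not taken from the paper, whose exact definition may differ.

```python
from typing import Sequence


def win_rate(scores_a: Sequence[float], scores_b: Sequence[float]) -> float:
    """Share of datasets on which model A scores higher than model B.

    Assumption: higher is better, and a tie counts as half a win.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two equally long, non-empty score lists")
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a, b in zip(scores_a, scores_b)
    )
    return wins / len(scores_a)


# Hypothetical per-dataset accuracies, invented purely for illustration.
model_a = [0.91, 0.84, 0.77, 0.88]
model_b = [0.89, 0.84, 0.79, 0.80]

rate = win_rate(model_a, model_b)
print(f"win rate: {rate:.1%}")  # two wins, one tie, one loss -> 62.5%
```

A 100% win rate under this reading means model A was strictly better on every dataset in the comparison, which is why the size limits attached to the claim matter.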

References

250 extracted · 250 resolved · 7 Pith anchors

[1] arXiv:2506.16791 [cs], 2025
[2] XGBoost: A scalable tree boosting system, 2016
[3] CatBoost: Unbiased boosting with categorical features. Advances in Neural Information Processing Systems, 31, 2018
[4] LightGBM: A highly efficient gradient boosting decision tree, 2017
[5] Applying constraint satisfaction techniques to job shop scheduling, 2001 · doi:10.1023/a

Formal links

1 machine-checked theorem link

Cited by

35 papers in Pith

Receipt and verification
First computed: 2026-05-17T23:38:53.573895Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

7e8041627cb8ef5b7aeb8a1d1520707ca72c006a8dd784ad25a232dea2aa2ea1

Aliases

arxiv: 2511.08667 · arxiv_version: 2511.08667v2 · doi: 10.48550/arxiv.2511.08667 · pith_short_12: P2AECYT4XDXV · pith_short_16: P2AECYT4XDXVW6XL · pith_short_8: P2AECYT4
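The identifiers on this page appear to be related in a simple way: the 26-character Pith Number matches the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash, and the short aliases are prefixes of the Pith Number. The sketch below checks that relationship for this record. It is an observation about the values shown here, not a documented derivation rule, and the helper name `base32_prefix` is made up for the example.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "7e8041627cb8ef5b7aeb8a1d1520707ca72c006a8dd784ad25a232dea2aa2ea1"
PITH_NUMBER = "P2AECYT4XDXVW6XLRIORKIDQPS"


def base32_prefix(hex_digest: str, length: int) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


derived = base32_prefix(CANONICAL_HASH, len(PITH_NUMBER))
print(derived == PITH_NUMBER)

# The short aliases listed above are plain prefixes of the full number.
aliases = {f"pith_short_{n}": PITH_NUMBER[:n] for n in (8, 12, 16)}
print(aliases)
```

If the relationship holds in general, any mirror could recompute a Pith Number from the canonical hash alone, but that should be confirmed against the schema documentation before relying on it.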
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/P2AECYT4XDXVW6XLRIORKIDQPS \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 7e8041627cb8ef5b7aeb8a1d1520707ca72c006a8dd784ad25a232dea2aa2ea1
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "34d2eb8cbaa7899adab58fe896c619c419da6ad5a95e700f54d3bf64098685d6",
    "cross_cats_sorted": [
      "stat.ML"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2025-11-11T18:57:15Z",
    "title_canon_sha256": "3237b357c6bb4bfa8fd6e8fa85eaf5e2d5d077dacfe3e658bce83dcd9adcc2b2"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2511.08667",
    "kind": "arxiv",
    "version": 2
  }
}
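
The hashing step of the shell pipeline under "Verify this Pith Number yourself" can also be run offline. The sketch below applies the same serialization settings (sorted keys, compact separators, `ensure_ascii=False`, UTF-8) to the record printed above and compares the digest with the published canonical hash. It assumes the JSON shown on this page is exactly what the API returns under `canonical_record`; the function names are illustrative.

```python
import hashlib
import json

# Canonical record as printed on this page (assumed identical to the API output).
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "34d2eb8cbaa7899adab58fe896c619c419da6ad5a95e700f54d3bf64098685d6",
    "cross_cats_sorted": ["stat.ML"],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2025-11-11T18:57:15Z",
    "title_canon_sha256": "3237b357c6bb4bfa8fd6e8fa85eaf5e2d5d077dacfe3e658bce83dcd9adcc2b2"
  },
  "schema_version": "1.0",
  "source": {"id": "2511.08667", "kind": "arxiv", "version": 2}
}
"""

PUBLISHED_HASH = "7e8041627cb8ef5b7aeb8a1d1520707ca72c006a8dd784ad25a232dea2aa2ea1"


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and no extra whitespace, then encode as UTF-8."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """Hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_hash(record)
print(digest)
print("matches published hash:", digest == PUBLISHED_HASH)
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror stores the fields, only on their values.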