Zero-Run auditing supplies valid lower bounds on differential privacy parameters from fixed member and non-member datasets by modeling and correcting distribution-shift confounding via causal-inference techniques.
TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second
Canonical reference. 88% of citing Pith papers cite this work as background.
abstract
We present TabPFN, a trained Transformer that can do supervised classification for small tabular datasets in less than a second, needs no hyperparameter tuning and is competitive with state-of-the-art classification methods. TabPFN performs in-context learning (ICL): it learns to make predictions using sequences of labeled examples (x, f(x)) given in the input, without requiring further parameter updates. TabPFN is fully entailed in the weights of our network, which accepts training and test samples as a set-valued input and yields predictions for the entire test set in a single forward pass. TabPFN is a Prior-Data Fitted Network (PFN) and is trained offline once, to approximate Bayesian inference on synthetic datasets drawn from our prior. This prior incorporates ideas from causal reasoning: it entails a large space of structural causal models with a preference for simple structures. On the 18 datasets in the OpenML-CC18 suite that contain up to 1 000 training data points, up to 100 purely numerical features without missing values, and up to 10 classes, we show that our method clearly outperforms boosted trees and performs on par with complex state-of-the-art AutoML systems with up to 230× speedup. This increases to a 5 700× speedup when using a GPU. We also validate these results on an additional 67 small numerical datasets from OpenML. We provide all our code, the trained TabPFN, an interactive browser demo and a Colab notebook at https://github.com/automl/TabPFN.
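The abstract describes training on synthetic datasets drawn from a prior over structural causal models (SCMs) that prefers simple structures. The following is a toy sketch of what drawing one such dataset could look like. It is not the authors' prior: the function name `sample_scm_dataset`, the `edge_prob` parameter, the tanh mechanism, the noise scale, and the rank-based label binning are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
import random


def sample_scm_dataset(n_samples=100, n_features=4, n_classes=2,
                       edge_prob=0.3, seed=0):
    """Draw one synthetic classification dataset from a random toy SCM.

    Hypothetical illustration only. Nodes are ordered, so edges run from
    a lower index to a higher one, which makes the graph a DAG by
    construction. A small edge_prob favours sparse, simple graphs.
    """
    rng = random.Random(seed)
    n_nodes = n_features + 1  # the last node is used as the target

    # weights[j][i] is the effect of parent i on child j (i < j);
    # a weight of 0.0 means there is no edge.
    weights = [
        [rng.gauss(0.0, 1.0) if rng.random() < edge_prob else 0.0
         for _ in range(j)]
        for j in range(n_nodes)
    ]

    rows = []
    for _ in range(n_samples):
        values = []
        for j in range(n_nodes):
            # Each node is a squashed weighted sum of its parents plus noise.
            parent_sum = sum(w * v for w, v in zip(weights[j], values))
            values.append(math.tanh(parent_sum) + rng.gauss(0.0, 0.1))
        rows.append(values)

    # Turn the continuous target node into class labels by rank,
    # which yields (near-)equal-sized classes.
    order = sorted(range(n_samples), key=lambda r: rows[r][-1])
    labels = [0] * n_samples
    for rank, r in enumerate(order):
        labels[r] = rank * n_classes // n_samples

    features = [row[:-1] for row in rows]
    return features, labels


X, y = sample_scm_dataset(n_samples=100, n_features=4, n_classes=2, seed=0)
```

In a PFN-style setup, many such datasets would be drawn with different seeds and graph sizes, each split into labeled context rows and held-out query rows, and the network would be trained to predict the query labels from the context. The sketch above covers only the data-sampling step.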
representative citing papers
Transformers performing in-context learning implicitly implement gradient descent, ridge regression, and least-squares predictors for linear models, with behavior shifting based on model depth, width, and data noise.
α-PFN trains two PFNs in sequence to predict expected information gain for entropy search, delivering over 50x speedups while remaining competitive on synthetic and real-world benchmarks.
Face-Feature Tuning is a label-free logit remapping method that reduces FPR/TPR gaps across groups in deepfake detection while preserving overall accuracy.
TabPrep is a new feature engineering pipeline that targets three data patterns and improves performance of tree-based, neural, linear, and foundation models on tabular benchmarks, often more than model architecture changes.
TabQL is a reinforcement learning framework that substitutes a tabular foundation model with in-context capabilities for the parametric Q-network in DQN, with a warm-up phase and theoretical analysis claiming improved sample efficiency.
SCAgent automates side-channel leakage discovery via LLM agents for target identification and few-shot foundation models for scalable analysis on iOS.
SurvivalPFN amortizes Bayesian survival analysis for right-censored data by pretraining a prior-data fitted network on synthetic identifiable DGPs and then performing in-context inference, achieving competitive results on 61 real datasets.
FORGE reformulates molecular optimization as context-aware fragment ranking and replacement using mined low-to-high edit pairs, outperforming larger language models and graph methods on standard benchmarks.
Forecast loss differentials are reframed as returns and assessed with risk-adjusted finance metrics, showing professional forecasters are harder to beat on risk-adjusted performance than on raw accuracy in US macro forecasting.
Schema-1 is the first Data Language Model that natively understands raw tabular data and outperforms gradient-boosted ensembles, AutoML, and prior tabular foundation models on row-level prediction and imputation tasks.
TFM-Retouche is an architecture-agnostic input-space residual adapter that improves tabular foundation model accuracy on 51 datasets by learning input corrections through the frozen backbone, with an identity guard to fall back to the original model.
PHBench shows Product Hunt launch signals predict Series A funding with an ensemble model reaching AP 0.037 and F0.5 0.097 on blind test data, outperforming logistic regression and zero-shot LLMs.
Time series foundation models match the performance of specialized models for day-ahead load forecasting while providing explanations that match domain knowledge on weather and calendar effects.
TabDistill distills feature interactions from tabular foundation models via post-hoc attribution and inserts them into GAMs, yielding consistent predictive gains.
The authors release the first Slovene ESG sentiment dataset from news and report that large language models lead on environmental and social classification while fine-tuned SloBERTa performs best on governance.
Reasoning LLMs with minimal tools for tree construction and analysis induce decision trees that outperform CART, compete with ensembles on low-resource tabular data, and provide human-readable reasoning traces.
Compares foundation models for probabilistic low-voltage load forecasting on 200 real feeders and introduces a grid-planning metric that scores peak prediction by its effect on asset cost-risk decisions.
STOIC integrates STGNN point forecasting with tabular foundation model in-context learning for conformal prediction to quantify uncertainty in graph-structured energy time series.
Trio proposes Temporal-Spatial-Sample attention and a TS-SCM synthetic data generator to improve multivariate time-series forecasting by reusing historical patterns and structural priors.
DFPL introduces prototype-based disentanglement and alignment modules to preserve fine-grained consistency across heterogeneous modalities for better performance under missing data conditions.
LimiX-2M outperforms larger TabPFN-v2 and TabICL models on tabular benchmarks by expanding scalars into RBF features and using a reordered S->N->F attention block.
LLMTabBench evaluates LLMs on zero- and few-shot binary tabular classification and reports that zero-shot can outperform few-shot due to example conflicts with model priors while performance drops beyond a complexity threshold.
FLUXtrapolation is a benchmark for domain generalization in ecosystem flux upscaling using temporal, spatial, and temperature-based extrapolation scenarios, with pilot results showing model separation on tail and multi-scale metrics.
citing papers explorer
- From Uniform to Learned Knots: A Study of Spline-Based Numerical Encodings for Tabular Deep Learning
Spline encodings for numerical features show task-dependent performance in tabular deep learning, with piecewise-linear encoding robust for classification and variable results for regression depending on spline family, knot strategy, and backbone.