xRFM: Accurate, scalable, and interpretable feature learning models for tabular data

Daniel Beaglehole, David Holzmüller, Adityanarayanan Radhakrishnan, Mikhail Belkin

classification: cs.LG, stat.ML

keywords: data, methods, tabular, xrfm, best, feature, gbdts, learning

Inference from tabular data, collections of continuous and categorical variables organized into matrices, is a foundation for modern technology and science. Yet, in contrast to the explosive changes in the rest of AI, the best practice for these predictive tasks has been relatively unchanged and is still primarily based on variations of Gradient Boosted Decision Trees (GBDTs). Very recently, there has been renewed interest in developing state-of-the-art methods for tabular data based on recent developments in neural networks and feature learning methods. In this work, we introduce xRFM, an algorithm that combines feature learning kernel machines with a tree structure to both adapt to the local structure of the data and scale to essentially unlimited amounts of training data. We show that, compared to 31 other methods, including recently introduced tabular foundation models (TabPFNv2) and GBDTs, xRFM achieves the best performance across 100 regression datasets and is competitive with the best methods across 200 classification datasets, outperforming GBDTs. Additionally, xRFM provides interpretability natively through the Average Gradient Outer Product.
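
The Average Gradient Outer Product (AGOP) named in the abstract is usually defined as the average, over the data points, of the outer product of a predictor's gradient with itself. The following is a minimal sketch of that definition only, assuming a hand-written toy predictor rather than xRFM itself; the names `agop` and `toy_grad` are hypothetical and do not come from any xRFM API.

```python
import numpy as np

def agop(grad_fn, X):
    """Average Gradient Outer Product: (1/n) * sum_i g(x_i) g(x_i)^T,
    where g(x) is the gradient of the predictor with respect to its input."""
    G = np.stack([grad_fn(x) for x in X])  # (n, d) matrix of per-sample gradients
    return G.T @ G / len(X)                # (d, d) symmetric PSD matrix

# Hypothetical toy predictor f(x) = sin(x_0) + 0.5 * x_1**2; it ignores x_2.
def toy_grad(x):
    return np.array([np.cos(x[0]), x[1], 0.0])

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))   # synthetic data, for illustration only

M = agop(toy_grad, X)
importance = np.diag(M)         # diagonal entries act as per-feature importance scores
```

In this toy setting the third diagonal entry is exactly zero because the predictor never uses that feature, which is the sense in which the AGOP diagonal can be read as a feature-importance summary.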

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

AGOP as Explanation: From Feature Learning to Per-Sample Attribution in Image Classifiers
cs.LG 2026-05 conditional novelty 7.0

AGOP-based attribution methods outperform Integrated Gradients and other baselines on pixel-level ground truth benchmarks for explaining image classifier decisions, with AGOP-Global offering zero inference cost.

TabPFN-3: Technical Report
cs.LG 2026-05 unverdicted novelty 6.0

TabPFN-3 delivers state-of-the-art tabular prediction performance on benchmarks up to 1M rows, is up to 20x faster than prior versions, and introduces test-time scaling that beats non-TabPFN models by hundreds of Elo points.

TabPFN-2.5: Advancing the State of the Art in Tabular Foundation Models
cs.LG 2025-11 unverdicted novelty 6.0

TabPFN-2.5 scales tabular foundation models to 20x larger datasets, outperforms tuned tree models on TabArena, achieves near-perfect win rates against default XGBoost, and adds a distillation engine for fast productio...