A new measure of rank correlation. Biometrika, 30(1-2):81–93
6 Pith papers cite this work.
citing papers explorer
- STRABLE: Benchmarking Tabular Machine Learning with Strings
  A new corpus of 108 mixed string-numeric tables shows that advanced tabular learners with basic string embeddings perform well on most real-world data, while large LLM encoders help on free-text heavy tables.
- SGR-Bench: Benchmarking Search Agents on State-Gated Retrieval
  SGR-Bench evaluates agentic LLM systems on state-gated retrieval tasks where evidence is only accessible after configuring site-specific states, with the strongest system reaching 66.18% item-level F1 and failures dominated by retrieval-scope drift.
- Expressiveness Limits of Autoregressive Semantic ID Generation in Generative Recommendation
  Autoregressive semantic ID generation creates tree-induced probability correlations that prevent generative recommenders from capturing simple patterns; Latte adds latent tokens to relax these correlations.
- In-context learning to predict critical transitions in dynamical systems
  TipPFN uses prior-data fitted networks and in-context learning on synthetic bifurcation data to detect proximity to critical transitions in unseen dynamical systems and real observations.
- ModelLens: Finding the Best for Your Task from Myriads of Models
  ModelLens learns a performance-aware latent space from 1.62M leaderboard records to rank unseen models on unseen datasets without forward passes on the target.
- Current validation practice undermines surgical AI development
  A multi-stage Delphi consensus with 92 experts catalogs widespread validation pitfalls in surgical AI video analysis across data, metrics, and reporting, supported by a systematic review and empirical experiments.