Model Consistency as a Cheap yet Predictive Proxy for LLM Elo Scores

1 Pith paper cites this work. Polarity classification is still being indexed.

fields

cs.AI 1

years

2026 1

verdicts

UNVERDICTED 1

representative citing papers

PACE: A Proxy for Agentic Capability Evaluation

cs.AI · 2026-07-02 · unverdicted · novelty 6.0

PACE builds proxy benchmarks from non-agentic instances, using relevance and global selection followed by regression, to predict agentic scores. It reports MAE under 4%, Spearman correlation above 0.80, and 85% ranking accuracy at under 1% of the cost.

citing papers explorer

Showing 1 of 1 citing paper.

  • PACE: A Proxy for Agentic Capability Evaluation · cs.AI · 2026-07-02 · unverdicted · none · ref 40