Title resolution pending

URL: https://arxiv · 2023 · arXiv 2312.03121

3 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

baseline 1

citation-polarity summary

baseline 1

representative citing papers

How Hard is it to Rig a Benchmark? A Social Choice Analysis of Leaderboard Robustness

cs.LG · 2026-05-22 · unverdicted · novelty 7.0

Benchmark-specific training maps to shift bribery and is NP-hard under Borda and mean win rate; mean win rate has the highest instance-level robustness (median 22 tasks on BBH) among tested aggregation rules.

Nash without Numbers: A Social Choice Approach to Mixed Equilibria in Context-Ordinal Games

cs.GT · 2026-05-08 · unverdicted · novelty 7.0

Context-ordinal Nash equilibria are defined via social choice aggregation of ordinal preferences, shown to exist under mild conditions, with regularization, approximation, regret notions, complexity results, and learning rules developed.

Who Defines "Best"? Towards Interactive, User-Defined Evaluation of LLM Leaderboards

cs.AI · 2026-04-23 · unverdicted · novelty 6.0

Analysis of the LMArena dataset reveals heavy topic skew and varying model rankings, leading to an interactive visualization tool for users to define custom evaluation priorities on LLM leaderboards.

citing papers explorer

Showing 3 of 3 citing papers.

How Hard is it to Rig a Benchmark? A Social Choice Analysis of Leaderboard Robustness
cs.LG · 2026-05-22 · unverdicted · none · ref 17
Nash without Numbers: A Social Choice Approach to Mixed Equilibria in Context-Ordinal Games
cs.GT · 2026-05-08 · unverdicted · none · ref 41
Who Defines "Best"? Towards Interactive, User-Defined Evaluation of LLM Leaderboards
cs.AI · 2026-04-23 · unverdicted · none · ref 30
