3 Pith papers cite this work, all from 2026. Polarity classification is still indexing, so the 3 representative citing papers listed below are currently unverdicted.
citing papers explorer
-
Pretraining Exposure Explains Popularity Judgments in Large Language Models
LLM popularity judgments align more closely with pretraining data exposure counts than with Wikipedia popularity, with effects that are stronger in pairwise comparisons and in larger models.
-
A Scalable and Unified Framework to Weighted Rank Aggregation
A structural reduction to local medians yields new constant-round MPC (2-α)- and (2-ζ)-approximation algorithms for 1-median rank aggregation under multiple distance measures, plus an improved 1.968-approximation for the Ulam distance that extends to the weighted case.
-
Debiasing Reward Models via Causally Motivated Inference-Time Intervention
Neuron-level inference-time intervention reduces multiple biases in reward models, enabling 2B and 7B models to match the performance of 70B models on LLM alignment benchmarks without trade-offs.