RuDE predicts the post-training performance of base LLMs with over 90% correlation by using response discrimination on rubric-violating contrastive pairs; RL validation confirms that it can identify high-potential smaller models.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
On Predicting the Post-training Potential of Pre-trained LLMs