From Generation to Judgment: Opportunities and Challenges of LLM-as-a-Judge
7 Pith papers cite this work, alongside 44 external citations. Polarity classification is still indexing.
7 representative citing papers (2026)
Item response theory applied to 17 LLMs on SciEntsBank and Beetle reveals that models with similar overall scores differ sharply in robustness to difficult responses, with errors clustering on partial-credit labels.
LaaB improves LLM hallucination detection by mapping self-judgment labels back into neural feature space and using mutual learning under logical consistency constraints between responses and meta-judgments.
ARA extracts workflow graphs from papers and scores reproducibility, reaching 61% accuracy on 213 ReScience C articles and outperforming priors on ReproBench and GoldStandardDB.
Both humans and LLMs trust content more when labeled human-authored than AI-generated, with LLMs showing denser attention to labels and higher uncertainty under AI labels, mirroring human heuristic patterns.
LLM assistance shortens idea-generation phases and reduces moments of creative insight during programming tasks, while yielding solutions with comparable idea counts and greater functional correctness.
A reasoning-distillation plus dual-reward GRPO method for multi-role dialogue summarization matches ROUGE and BERTScore baselines while improving factual faithfulness and preference alignment on CSDS and SAMSum.