LLMs correct only 34.8% of zero-shot annotation errors via prompting, and Definition-Specific Familiarity correlates positively with performance (partial r = +0.41) while memorization metrics do not.
On second thought, let ' s not think step by step! bias and toxicity in zero-shot reasoning
3 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CL 3years
2026 3verdicts
UNVERDICTED 3representative citing papers
LLMs outperform humans in expressing illocutionary intents and sycophancy in successful persuasive counter-arguments from ChangeMyView, with crowd workers preferring LLM versions.
Reasoning in large output spaces proceeds via shortlisting then fine-grained reasoning; this characterization enables a mechanistic distillation strategy that outperforms standard distillation.
citing papers explorer
-
On the Limits of LLM Adaptability: Impact of Model-Internalized Priors on Annotation Task Performance
LLMs correct only 34.8% of zero-shot annotation errors via prompting, and Definition-Specific Familiarity correlates positively with performance (partial r = +0.41) while memorization metrics do not.
-
"I understand your perspective": LLM Persuasion and Sycophancy through the Lens of Communicative Action Theory
LLMs outperform humans in expressing illocutionary intents and sycophancy in successful persuasive counter-arguments from ChangeMyView, with crowd workers preferring LLM versions.
-
Characterize Then Distill: Mechanistic Reasoning in Large Output Spaces
Reasoning in large output spaces proceeds via shortlisting then fine-grained reasoning; this characterization enables a mechanistic distillation strategy that outperforms standard distillation.