Preregistered placebo-controlled decomposition shows external executable counterexamples drive self-repair gains in small code models more than re-exposure or self-critique.
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.SE (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
Coding LLMs exhibit detrimental semantic collapse on underspecified prompts by producing consistent but incorrect code rather than incoherent variations, affecting 3-32% of tasks across MBPP, HumanEval, and LiveCodeBench.
citing papers explorer
- Falsification, Not Exposure: An Internally Preregistered Placebo-Controlled Decomposition of Self-Repair Feedback in Frozen Small Code Models
Preregistered placebo-controlled decomposition shows external executable counterexamples drive self-repair gains in small code models more than re-exposure or self-critique.
- Underspecification does not imply Incoherence: The Risks of Semantic Collapse in Coding Models
Coding LLMs exhibit detrimental semantic collapse on underspecified prompts by producing consistent but incorrect code rather than incoherent variations, affecting 3-32% of tasks across MBPP, HumanEval, and LiveCodeBench.