LLMs show statistical preemption for 120 verb-construction pairs, with surprisal driven by competing-form frequency rather than verb frequency, scaling as a power law with size, and causally shifted by controlled fine-tuning.
Wesley Scivetti and Nathan Schneider
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
Lil-Bevo applies music pretraining, curriculum learning on sequence length, and targeted masking to small LMs in the BabyLM challenge, finding modest gains from short sequences but overall limited performance.
citing papers explorer
- Do Language Models Know What Not to Say? Causal Evidence for Statistical Preemption in LLMs
  LLMs show statistical preemption for 120 verb-construction pairs, with surprisal driven by competing-form frequency rather than verb frequency, scaling as a power law with size, and causally shifted by controlled fine-tuning.
- Lil-Bevo: Explorations of Strategies for Training Language Models in More Humanlike Ways
  Lil-Bevo applies music pretraining, curriculum learning on sequence length, and targeted masking to small LMs in the BabyLM challenge, finding modest gains from short sequences but overall limited performance.