Wesley Scivetti and Nathan Schneider

Are emergent abilities of large language models a mirage? In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems · 2023 · arXiv 2208.07998

2 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Do Language Models Know What Not to Say? Causal Evidence for Statistical Preemption in LLMs

cs.CL · 2026-05-21 · unverdicted · novelty 8.0

LLMs show statistical preemption for 120 verb-construction pairs, with surprisal driven by competing-form frequency rather than verb frequency, scaling as a power law with size, and causally shifted by controlled fine-tuning.

Lil-Bevo: Explorations of Strategies for Training Language Models in More Humanlike Ways

cs.CL · 2023-10-26 · unverdicted · novelty 3.0

Lil-Bevo applies music pretraining, curriculum learning on sequence length, and targeted masking to small LMs in the BabyLM challenge, finding modest gains from short sequences but overall limited performance.

citing papers explorer

Showing 2 of 2 citing papers.

Do Language Models Know What Not to Say? Causal Evidence for Statistical Preemption in LLMs

cs.CL · 2026-05-21 · unverdicted · none · ref 6

LLMs show statistical preemption for 120 verb-construction pairs, with surprisal driven by competing-form frequency rather than verb frequency, scaling as a power law with size, and causally shifted by controlled fine-tuning.

Lil-Bevo: Explorations of Strategies for Training Language Models in More Humanlike Ways

cs.CL · 2023-10-26 · unverdicted · none · ref 33

Lil-Bevo applies music pretraining, curriculum learning on sequence length, and targeted masking to small LMs in the BabyLM challenge, finding modest gains from short sequences but overall limited performance.
