During pretraining, language models exhibit natural ungrokking where learned rules are forgotten based on their support frequency in the corpus, with asymmetric editability of rule survival.
author Garrette, D
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 3representative citing papers
A framework using language models to simulate non-existent experiments and derive novel testable hypotheses on dative verb acquisition and cross-structural generalization in children.
Emergent abilities are capabilities present in large language models but absent in smaller ones and cannot be predicted by extrapolating smaller model performance.
citing papers explorer
-
Natural Ungrokking: Asymmetric Control of Which Rules Survive Pretraining
During pretraining, language models exhibit natural ungrokking where learned rules are forgotten based on their support frequency in the corpus, with asymmetric editability of rule survival.
-
A systematic framework for generating novel experimental hypotheses from language models
A framework using language models to simulate non-existent experiments and derive novel testable hypotheses on dative verb acquisition and cross-structural generalization in children.
-
Emergent Abilities of Large Language Models
Emergent abilities are capabilities present in large language models but absent in smaller ones and cannot be predicted by extrapolating smaller model performance.