To do this, we expand the loss to first order in β and δ around the unigram solution, L(β, δ) = L_1-Gen − c_β β − c_δ δ +

Acquisition of the 2-Gen solution. We now examine the kinetics of acquisition of the 2-Gen solution w_C = 1, β → ∞, δ → ∞

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Distinct mechanisms underlying in-context learning in transformers

cs.LG · 2026-04-14 · unverdicted · novelty 6.0

Transformers develop four algorithmic phases of in-context learning on Markov chains via two distinct multi-layer subcircuit mechanisms, with phase boundaries set by data diversity K.
