In-context learning in LLMs unfolds as a trajectory through a structured, low-dimensional conceptual belief space; this structure is visible in both behavior and internal representations and can be causally manipulated through interventions.
Core Francisco Park, Ekdeep Singh Lubana, Itamar Pres, and Hidenori Tanaka
3 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
Derives a closed-form optimal attention temperature that minimizes ICL generalization error under distribution shift, links it to the moments of the pre-softmax scores, and validates the result on LLMs.