The HERO'S JOURNEY benchmark evaluates LLMs on attribute and procedural rule induction across four structural forms, finding limited and uneven performance, with execution as the main bottleneck and steering helping only on attribute tasks.
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
LLMs perform in-context learning as trajectories through a structured low-dimensional conceptual belief space, with the structure visible in both behavior and internal representations and causally manipulable via interventions.
citing papers explorer
- HERO'S JOURNEY: Testing Complex Rule Induction with Text Games
  The HERO'S JOURNEY benchmark evaluates LLMs on attribute and procedural rule induction across four structural forms, finding limited and uneven performance, with execution as the main bottleneck and steering helping only on attribute tasks.
- Stories in Space: In-Context Learning Trajectories in Conceptual Belief Space
  LLMs perform in-context learning as trajectories through a structured low-dimensional conceptual belief space, with the structure visible in both behavior and internal representations and causally manipulable via interventions.