pith. machine review for the scientific record.

arxiv: 2601.20164 · v2 · submitted 2026-01-28 · 💻 cs.LG · cs.AI · cs.CL

Recognition: unknown

What's the plan? Metrics for implicit planning in LLMs and their application to rhyme generation and question answering

Authors on Pith: no claims yet
classification 💻 cs.LG cs.AI cs.CL
keywords models · planning · implicit · rhyme · generation · language · question · token
read the original abstract

Prior work suggests that language models, while trained on next-token prediction, show implicit planning behavior: they may select the next token in preparation for a predicted future token, such as a likely rhyming word, as supported by a prior qualitative study of Claude 3.5 Haiku using a cross-layer transcoder. We propose much simpler techniques for assessing implicit planning in language models. With case studies on rhymed poetry generation and question answering, we demonstrate that our methodology scales easily to many models. Across models, we find that the generated rhyme (e.g. "-ight") or the answer to a question (e.g. "whale") can be manipulated by steering with a vector at the end of the preceding line, affecting the generation of the intermediate tokens leading up to the rhyme or answer word. We show that implicit planning is a universal mechanism, present in smaller models than previously thought, starting from 1B parameters. Our methodology offers a widely applicable, direct way to study the implicit planning abilities of LLMs. More broadly, understanding the planning abilities of language models can inform decisions in AI safety and control.
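The intervention the abstract describes, nudging the residual stream with a vector at the end of a line and checking whether later tokens bend toward a target rhyme, can be illustrated with standard tooling. The sketch below is a hypothetical reconstruction under stated assumptions, not the paper's implementation: the model (Qwen/Qwen2.5-1.5B), layer index, steering scale, and contrastive prompts are placeholders, and the steering vector is built as a simple activation difference, which may differ from the authors' procedure.

```python
# A minimal sketch, assuming a Hugging Face causal LM. The model name,
# layer index, scale, and prompts are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "Qwen/Qwen2.5-1.5B"  # assumption: any small open causal LM
LAYER = 12                   # assumption: a mid-depth decoder layer
SCALE = 4.0                  # assumption: steering strength

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

def last_token_residual(prompt: str) -> torch.Tensor:
    """Residual-stream state at the final prompt token, after layer LAYER."""
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, output_hidden_states=True)
    # hidden_states[0] is the embedding output, so index LAYER + 1 is
    # the output of decoder layer LAYER.
    return out.hidden_states[LAYER + 1][0, -1]

# Contrastive steering vector: a line set up for an "-ight" rhyme minus
# one set up for a different rhyme (hypothetical prompt pair).
direction = last_token_residual("The stars above were burning bright,") \
          - last_token_residual("He heard a distant thumping sound,")
direction = direction / direction.norm()

def steer_hook(_module, _inputs, output):
    hidden = output[0] if isinstance(output, tuple) else output
    # Steer only on the prefill pass, so the push lands on the final
    # prompt token (the end of the preceding line) rather than on
    # every generated token.
    if hidden.shape[1] > 1:
        hidden[:, -1, :] += SCALE * direction
    if isinstance(output, tuple):
        return (hidden,) + output[1:]
    return hidden

prompt = "The lantern glowed on through the night,\n"
handle = model.model.layers[LAYER].register_forward_hook(steer_hook)
try:
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=15, do_sample=False)
finally:
    handle.remove()

print(tok.decode(out[0][ids.shape[1]:], skip_special_tokens=True))
```

On this sketch's logic, evidence of planning would be the generated second line bending toward or away from an "-ight" ending even though the intervention touched only the position at the end of the first line.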

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Do Large Language Models Plan Answer Positions? Position Bias in Multiple-Choice Question Generation

    cs.CL · 2026-05 · unverdicted · novelty 6.0

    LLMs implicitly plan answer positions during MCQ generation, as shown by predictive signals in hidden representations and controllable shifts via activation steering.