Faithful Reasoning Using Large Language Models
6 Pith papers cite this work.
citing papers explorer
- Visual Perceptual to Conceptual First-Order Rule Learning Networks: γILP is a differentiable pipeline for inducing first-order rules from unlabeled image data, with strong performance on symbolic relational datasets, relational images, and pure image datasets such as Kandinsky patterns.
- Measuring Faithfulness in Chain-of-Thought Reasoning: Chain-of-Thought reasoning in LLMs is often unfaithful; reliance on it varies by task and decreases as models grow larger.
- When Reasoning Traces Become Performative: Step-Level Evidence that Chain-of-Thought Is an Imperfect Oversight Channel: CoT traces align with internal answer commitment in only 61.9% of steps on average, and are dominated by confabulated continuations after commitment has stabilized.
- LLM Reasoning Is Latent, Not the Chain of Thought: LLM reasoning is primarily mediated by latent-state trajectories rather than by explicit surface chain-of-thought outputs.
- Learning to Draw ASCII Improves Spatial Reasoning in Language Models: Training LLMs on text-to-ASCII spatial layout construction improves text-only spatial reasoning and transfers to external benchmarks.