Pith: machine review for the scientific record

Chain of thought empowers transformers to solve inherently serial problems

8 Pith papers cite this work. Polarity classification is still indexing.

Citing papers: 8 total. By year: 2026 (7), 2024 (1). Verdicts: unverdicted (8).

representative citing papers

Training Transformers as a Universal Computer

cs.AI · 2026-04-28 · unverdicted · novelty 7.0

A transformer trained on random meaningless MicroPy programs generalizes to execute diverse human-written programs, providing empirical evidence it can act as a universal computer.

Training Large Language Models to Reason in a Continuous Latent Space

cs.CL · 2024-12-09 · unverdicted · novelty 7.0

Coconut lets LLMs perform reasoning directly in continuous latent space by recycling hidden states as inputs, outperforming standard chain-of-thought on search-intensive logical tasks with better accuracy-efficiency trade-offs.

The Power of Power Law: Asymmetry Enables Compositional Reasoning

cs.AI · 2026-04-24 · unverdicted · novelty 6.0

Power-law data sampling creates beneficial asymmetry in the loss landscape that lets models acquire high-frequency skill compositions first, enabling more efficient learning of rare long-tail skills than uniform sampling does.

Measuring AI Reasoning: A Guide for Researchers

cs.AI · 2026-05-04 · unverdicted · novelty 4.0

Reasoning in language models should be measured by the faithfulness and validity of their multi-step search processes and intermediate traces, not final-answer accuracy.

citing papers explorer

Showing 8 of 8 citing papers.

  • Training Transformers as a Universal Computer cs.AI · 2026-04-28 · unverdicted · none · ref 12

    A transformer trained on random meaningless MicroPy programs generalizes to execute diverse human-written programs, providing empirical evidence it can act as a universal computer.

  • Internalized Reasoning for Long-Context Visual Document Understanding cs.CV · 2026-03-31 · unverdicted · none · ref 27

A synthetic pipeline creates and internalizes reasoning traces in VLMs for long-context visual document understanding, with a 32B model surpassing a 235B model on MMLongBenchDoc while emitting 12.4x fewer output tokens.

  • Training Large Language Models to Reason in a Continuous Latent Space cs.CL · 2024-12-09 · unverdicted · none · ref 20

    Coconut lets LLMs perform reasoning directly in continuous latent space by recycling hidden states as inputs, outperforming standard chain-of-thought on search-intensive logical tasks with better accuracy-efficiency trade-offs.

  • The Power of Power Law: Asymmetry Enables Compositional Reasoning cs.AI · 2026-04-24 · unverdicted · none · ref 32

Power-law data sampling creates beneficial asymmetry in the loss landscape that lets models acquire high-frequency skill compositions first, enabling more efficient learning of rare long-tail skills than uniform sampling does.

  • HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering cs.AI · 2026-04-22 · unverdicted · none · ref 135

HypEHR is a hyperbolic embedding model for EHR data that uses Lorentzian geometry and hierarchy-aware pretraining to answer clinical questions nearly as well as large language models at a fraction of their size.

  • NoisyCoconut: Counterfactual Consensus via Latent Space Reasoning cs.LG · 2026-05-06 · unverdicted · none · ref 42

    Injecting noise into LLM latent trajectories creates diverse reasoning paths whose agreement acts as a confidence signal for selective abstention, cutting error rates from 40-70% to under 15% on math tasks.

  • LLM Reasoning Is Latent, Not the Chain of Thought cs.AI · 2026-04-17 · unverdicted · none · ref 4

    LLM reasoning is primarily mediated by latent-state trajectories rather than by explicit surface chain-of-thought outputs.

  • Measuring AI Reasoning: A Guide for Researchers cs.AI · 2026-05-04 · unverdicted · none · ref 41

    Reasoning in language models should be measured by the faithfulness and validity of their multi-step search processes and intermediate traces, not final-answer accuracy.
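Two of the entries above (Coconut and NoisyCoconut) describe reasoning by recycling a model's hidden state directly as its next input instead of decoding a token at each step. The sketch below illustrates that control flow only, under loose assumptions: the single tanh layer, `step`, and `latent_reasoning` are illustrative stand-ins, not anything from the papers' code.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                       # hidden size of the toy model
W = rng.normal(size=(d, d)) / np.sqrt(d)    # one toy "layer" standing in for a transformer block

def step(h):
    """One latent reasoning step: a linear map plus a tanh nonlinearity."""
    return np.tanh(W @ h)

def latent_reasoning(x, n_steps=4):
    """Recycle the hidden state as the next input for n_steps, keeping the trajectory."""
    h = x
    trajectory = [h]
    for _ in range(n_steps):
        h = step(h)        # the hidden state re-enters as the next "embedding"
        trajectory.append(h)
    return h, trajectory

x0 = rng.normal(size=d)
h_final, traj = latent_reasoning(x0, n_steps=4)
print(len(traj))  # prints 5: the input plus 4 latent steps
```

The point of the loop is what distinguishes latent reasoning from standard chain-of-thought: no token is sampled between steps, so intermediate "thoughts" stay continuous vectors rather than discrete text.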