L-layer transformers under a Log-ICoT curriculum provably learn k-parity with poly(n) samples and log k stages, matching explicit CoT efficiency without inference overhead.
Let’s think dot by dot: Hidden computation in transformer language models. arXiv preprint arXiv:2404.15758
29 Pith papers cite this work. Polarity classification is still indexing.
roles: background (3)
representative citing papers
Corruption studies of CoT faithfulness largely measure explicit answer placement in prompt format rather than computational importance of reasoning steps.
In-distribution sampling across 25 models and controlled interventions with DAG-verified content show that semantic reasoning and validation content, not token count, drive CoT gains.
PearlVLA achieves SOTA on LIBERO by separating VLM representations into visual grounding and an iterative latent plan branch refined via world model queries and RefineNet with process-reward RL.
SWITCH uses explicit <swi> and </swi> boundary tokens to make latent chain-of-thought compatible with on-policy RL (GRPO) and open to causal mechanistic probing, outperforming prior hidden-state recurrence methods.
RiM trains LLMs to perform latent reasoning via fixed memory blocks processed in one forward pass using a two-stage curriculum, matching or exceeding prior latent methods on benchmarks.
CoT probe-time gains arise primarily from lexical activation and short-range token co-occurrence rather than sentence-level logical derivation.
Training-free looped transformers retrofit recurrence to frozen models via damped ODE sub-steps on mid-stack blocks, yielding gains such as +2.64 pp on MMLU-Pro for Qwen3-4B.
CopT reverses CoT by eliciting a draft answer first, then using continuous-embedding contrastive verification and on-policy thinking to reflect and correct, yielding up to 23% higher accuracy and 57% fewer tokens without training.
Post-Reasoning boosts LLM accuracy by reversing the usual answer-after-reasoning order, delivering mean relative gains of 17.37% across 117 model-benchmark pairs with zero extra cost.
PLUME uses latent-state autoregressive rollouts and a progressive training curriculum to deliver efficient reasoning for universal multimodal embeddings without generating explicit rationales.
Interlat lets LLM agents exchange last hidden states in latent space for communication, outperforming CoT baselines across models while enabling up to 24x faster inference via compression.
A learned continue-thinking token, trained via RL on its embedding alone, improves math benchmark accuracy more than fixed-token budget forcing in a frozen language model.
Coconut lets LLMs perform reasoning directly in continuous latent space by recycling hidden states as inputs, outperforming standard chain-of-thought on search-intensive logical tasks with better accuracy-efficiency trade-offs.
LARM enables test-time compute scaling in non-autoregressive ASR via depth-conditioned looping with CTC checkpoints, supervision embeddings, FiLM conditioning, and posterior feedback, yielding lower WER on LibriSpeech with more loops.
CIRF tokenizes CoT traces into functional units, fine-tunes models to autoregressively emit these tokens plus optional results, and reports improved accuracy-latency trade-offs on math, symbolic, and commonsense benchmarks.
Premature confidence in LLM chains of thought predicts flawed reasoning and is mitigated by progressive confidence shaping, a label-free RL objective that yields accuracy gains on arithmetic, math, and science tasks.
Reasoning language models extract answers from sparse, order-shuffled chain-of-thought traces with little accuracy loss.
SeLaR selectively applies latent soft reasoning in LLMs via entropy gating and contrastive regularization, outperforming standard CoT on five benchmarks without training.
MPS proposes a dual-brain architecture separating formulation reasoning from articulation to achieve real-time CoT in SLMs with accuracy comparable to full pre-computation but much lower latency.
Agentic LLM collectives are proposed as natural-language-interpretable computational substrates for ALife research.
DiscoLoop adds a discrete embedding channel to looped transformers to fix representational misalignment in two-hop reasoning, yielding near-perfect accuracy on synthetic tasks and better pretraining loss on real data.
Latent Recurrent Transformer augments autoregressive transformers with a cross-layer recurrent latent pathway from prior hidden states and uses interleaved parallel training to improve loss and in-context learning at ~0.3% extra parameters.
Injecting noise into LLM latent trajectories creates diverse reasoning paths whose agreement acts as a confidence signal for selective abstention, cutting error rates from 40-70% to under 15% on math tasks.