Let’s think dot by dot: Hidden computation in transformer language models. arXiv preprint arXiv:2404.15758

29 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

citation-polarity summary

background 2 · support 1

representative citing papers

Training-Free Looped Transformers

cs.LG · 2026-05-22 · unverdicted · novelty 7.0

Training-free looped transformers retrofit recurrence to frozen models via damped ODE sub-steps on mid-stack blocks, yielding gains such as +2.64 pp on MMLU-Pro for Qwen3-4B.

PLUME: Latent Reasoning Based Universal Multimodal Embedding

cs.CV · 2026-04-02 · unverdicted · novelty 7.0

PLUME uses latent-state autoregressive rollouts and a progressive training curriculum to deliver efficient reasoning for universal multimodal embeddings without generating explicit rationales.

Enabling Agents to Communicate Entirely in Latent Space

cs.LG · 2025-11-12 · unverdicted · novelty 7.0

Interlat lets LLM agents exchange last hidden states in latent space for communication, outperforming CoT baselines across models while enabling up to 24x faster inference via compression.

Training Large Language Models to Reason in a Continuous Latent Space

cs.CL · 2024-12-09 · unverdicted · novelty 7.0

Coconut lets LLMs perform reasoning directly in continuous latent space by recycling hidden states as inputs, outperforming standard chain-of-thought on search-intensive logical tasks with better accuracy-efficiency trade-offs.

citing papers explorer

Showing 23 of 23 citing papers after filters.