Think or Not? Exploring Thinking Efficiency in Large Reasoning Models via an Information-Theoretic Lens
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Representative citing papers
LLM reasoning failures cluster at early entropy-spike transitions; the GUARD inference-time framework redirects them for more reliable results.
Citing papers explorer
- LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling
AutoTTS discovers width-depth test-time scaling controllers through agentic search in a pre-collected trajectory environment, yielding better accuracy-cost tradeoffs than hand-designed baselines on math reasoning tasks at low cost.
- Dissecting Failure Dynamics in Large Language Model Reasoning
LLM reasoning failures cluster at early entropy-spike transitions; the GUARD inference-time framework redirects them for more reliable results.