CLSR lets LLM agents evolve and route symbolic languages that reduce generated tokens by 3-6x versus chain-of-thought while keeping accuracy on benchmarks.
Concise Thoughts: Impact of Output Length on LLM Reasoning and Cost
10 Pith papers cite this work. Polarity classification is still indexing.
citation-role and citation-polarity summary
years: 2026 (10)
roles: background (1)
polarities: background (1)
representative citing papers
DyCon dynamically controls reasoning depth in LRMs by modeling evolving difficulty from step-level embeddings, reducing redundant steps across multiple benchmarks.
CLORE augments correct on-policy rollouts by deleting repetitive and irrelevant segments, then optimizes with auxiliary DPO to improve the accuracy-efficiency trade-off on math benchmarks.
PUMA detects reasoning-level semantic redundancy to enable early exit in chains of thought, achieving 26.2% average token reduction across five LRMs and five benchmarks while preserving accuracy and CoT quality.
A large model generates a compact reasoning signal that a small model uses to solve tasks, reducing the large model's output tokens by up to 60% on benchmarks like AIME and GPQA.
LightThinker++ adds explicit adaptive memory management and a trajectory synthesis pipeline to LLM reasoning, cutting peak token use by ~70% while gaining accuracy in standard and long-horizon agent tasks.
CAT uses intrinsic confidence signals in preference optimization to adapt reasoning length in LRMs, outperforming uniform compression baselines on accuracy across benchmarks.
Checklist-improved prompts achieve the highest mean rubric score (7.50/8) and best quality-effort tradeoff compared to raw prompts (5.67) and clarifying-question prompts (6.67) across four task types and three LLMs.
Semi-CoT selects low-entropy pseudo-CoT chains from unlabeled questions via answer-level semantic entropy and shows high pseudo-answer precision but only small or negative gains on math reasoning benchmarks.
A systematic review of resource consumption threats in LLMs that organizes the problem along the full pipeline from threat induction to mitigation.
citing papers explorer
- Stop When Reasoning Converges: Semantic-Preserving Early Exit for Reasoning Models
PUMA detects reasoning-level semantic redundancy to enable early exit in chains of thought, achieving 26.2% average token reduction across five LRMs and five benchmarks while preserving accuracy and CoT quality.
- LightThinker++: From Reasoning Compression to Memory Management
LightThinker++ adds explicit adaptive memory management and a trajectory synthesis pipeline to LLM reasoning, cutting peak token use by ~70% while gaining accuracy in standard and long-horizon agent tasks.
- CAT: Confidence-Adaptive Thinking for Efficient Reasoning of Large Reasoning Models
CAT uses intrinsic confidence signals in preference optimization to adapt reasoning length in LRMs, outperforming uniform compression baselines on accuracy across benchmarks.
- Less Back-and-Forth: A Comparative Study of Structured Prompting
Checklist-improved prompts achieve the highest mean rubric score (7.50/8) and best quality-effort tradeoff compared to raw prompts (5.67) and clarifying-question prompts (6.67) across four task types and three LLMs.