CLSR lets LLM agents evolve and route symbolic languages that reduce generated tokens by 3-6x versus chain-of-thought while keeping accuracy on benchmarks.
Concise thoughts: Impact of output length on llm reasoning and cost
7 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 7roles
background 1polarities
background 1representative citing papers
CLORE augments correct on-policy rollouts by deleting repetitive and irrelevant segments then optimizes with auxiliary DPO to improve accuracy-efficiency trade-off on math benchmarks.
PUMA detects reasoning-level semantic redundancy to enable early exit in chains of thought, achieving 26.2% average token reduction across five LRMs and five benchmarks while preserving accuracy and CoT quality.
A large model generates a compact reasoning signal that a small model uses to solve tasks, reducing the large model's output tokens by up to 60% on benchmarks like AIME and GPQA.
LightThinker++ adds explicit adaptive memory management and a trajectory synthesis pipeline to LLM reasoning, cutting peak token use by ~70% while gaining accuracy in standard and long-horizon agent tasks.
Checklist-improved prompts achieve the highest mean rubric score (7.50/8) and best quality-effort tradeoff compared to raw prompts (5.67) and clarifying-question prompts (6.67) across four task types and three LLMs.
A systematic review of resource consumption threats in LLMs that organizes the problem along the full pipeline from threat induction to mitigation.
citing papers explorer
-
When LLMs Develop Languages: Symbolic Communication for Efficient Multi-Agent Reasoning
CLSR lets LLM agents evolve and route symbolic languages that reduce generated tokens by 3-6x versus chain-of-thought while keeping accuracy on benchmarks.
-
CLORE: Content-Level Optimization for Reasoning Efficiency
CLORE augments correct on-policy rollouts by deleting repetitive and irrelevant segments then optimizes with auxiliary DPO to improve accuracy-efficiency trade-off on math benchmarks.
-
Stop When Reasoning Converges: Semantic-Preserving Early Exit for Reasoning Models
PUMA detects reasoning-level semantic redundancy to enable early exit in chains of thought, achieving 26.2% average token reduction across five LRMs and five benchmarks while preserving accuracy and CoT quality.
-
When Less is Enough: Efficient Inference via Collaborative Reasoning
A large model generates a compact reasoning signal that a small model uses to solve tasks, reducing the large model's output tokens by up to 60% on benchmarks like AIME and GPQA.
-
LightThinker++: From Reasoning Compression to Memory Management
LightThinker++ adds explicit adaptive memory management and a trajectory synthesis pipeline to LLM reasoning, cutting peak token use by ~70% while gaining accuracy in standard and long-horizon agent tasks.
-
Less Back-and-Forth: A Comparative Study of Structured Prompting
Checklist-improved prompts achieve the highest mean rubric score (7.50/8) and best quality-effort tradeoff compared to raw prompts (5.67) and clarifying-question prompts (6.67) across four task types and three LLMs.
-
Resource Consumption Threats in Large Language Models
A systematic review of resource consumption threats in LLMs that organizes the problem along the full pipeline from threat induction to mitigation.