Concise Thoughts: Impact of Output Length on LLM Reasoning and Cost

Sania Nayab, Giulio Rossolini, Marco Simoni, Andrea Saracino, Giorgio Buttazzo, Nicolamaria Manes, Fabrizio Giacomelli · 2024 · arXiv 2407.19825

10 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

When LLMs Develop Languages: Symbolic Communication for Efficient Multi-Agent Reasoning

cs.AI · 2026-06-28 · unverdicted · novelty 6.0

CLSR lets LLM agents evolve and route symbolic languages that reduce generated tokens by 3-6x versus chain-of-thought while keeping accuracy on benchmarks.

DyCon: Dynamic Reasoning Control via Evolving Difficulty Modeling

cs.AI · 2026-06-05 · unverdicted · novelty 6.0

DyCon dynamically controls reasoning depth in LRMs by modeling evolving difficulty from step-level embeddings, reducing redundant steps across multiple benchmarks.

CLORE: Content-Level Optimization for Reasoning Efficiency

cs.AI · 2026-05-21 · unverdicted · novelty 6.0

CLORE augments correct on-policy rollouts by deleting repetitive and irrelevant segments, then optimizes with auxiliary DPO to improve the accuracy-efficiency trade-off on math benchmarks.

Stop When Reasoning Converges: Semantic-Preserving Early Exit for Reasoning Models

cs.CL · 2026-05-17 · unverdicted · novelty 6.0

PUMA detects reasoning-level semantic redundancy to enable early exit in chains of thought, achieving 26.2% average token reduction across five LRMs and five benchmarks while preserving accuracy and CoT quality.

When Less is Enough: Efficient Inference via Collaborative Reasoning

cs.LG · 2026-05-01 · conditional · novelty 6.0

A large model generates a compact reasoning signal that a small model uses to solve tasks, reducing the large model's output tokens by up to 60% on benchmarks like AIME and GPQA.

LightThinker++: From Reasoning Compression to Memory Management

cs.CL · 2026-04-04 · unverdicted · novelty 6.0

LightThinker++ adds explicit adaptive memory management and a trajectory synthesis pipeline to LLM reasoning, cutting peak token use by ~70% while gaining accuracy in standard and long-horizon agent tasks.

CAT: Confidence-Adaptive Thinking for Efficient Reasoning of Large Reasoning Models

cs.CL · 2026-07-01 · unverdicted · novelty 5.0

CAT uses intrinsic confidence signals in preference optimization to adapt reasoning length in LRMs, outperforming uniform compression baselines on accuracy across benchmarks.

Less Back-and-Forth: A Comparative Study of Structured Prompting

cs.CL · 2026-05-19 · unverdicted · novelty 5.0

Checklist-improved prompts achieve the highest mean rubric score (7.50/8) and best quality-effort tradeoff compared to raw prompts (5.67) and clarifying-question prompts (6.67) across four task types and three LLMs.

Revisiting Chain-of-Thought Reasoning under Limited Supervision: Semi-supervised Chain-of-Thought Learning

cs.AI · 2026-07-01 · unverdicted · novelty 4.0

Semi-CoT selects low-entropy pseudo-CoT chains from unlabeled questions via answer-level semantic entropy and shows high pseudo-answer precision but only small or negative gains on math reasoning benchmarks.

Resource Consumption Threats in Large Language Models

cs.CR · 2026-03-17 · unverdicted · novelty 2.0

A systematic review of resource consumption threats in LLMs that organizes the problem along the full pipeline from threat induction to mitigation.

citing papers explorer

Showing 4 of 4 citing papers after filters.

Stop When Reasoning Converges: Semantic-Preserving Early Exit for Reasoning Models

cs.CL · 2026-05-17 · unverdicted · none · ref 27

PUMA detects reasoning-level semantic redundancy to enable early exit in chains of thought, achieving 26.2% average token reduction across five LRMs and five benchmarks while preserving accuracy and CoT quality.

LightThinker++: From Reasoning Compression to Memory Management

cs.CL · 2026-04-04 · unverdicted · none · ref 13

LightThinker++ adds explicit adaptive memory management and a trajectory synthesis pipeline to LLM reasoning, cutting peak token use by ~70% while gaining accuracy in standard and long-horizon agent tasks.

CAT: Confidence-Adaptive Thinking for Efficient Reasoning of Large Reasoning Models

cs.CL · 2026-07-01 · unverdicted · none · ref 37

CAT uses intrinsic confidence signals in preference optimization to adapt reasoning length in LRMs, outperforming uniform compression baselines on accuracy across benchmarks.

Less Back-and-Forth: A Comparative Study of Structured Prompting

cs.CL · 2026-05-19 · unverdicted · none · ref 13

Checklist-improved prompts achieve the highest mean rubric score (7.50/8) and best quality-effort tradeoff compared to raw prompts (5.67) and clarifying-question prompts (6.67) across four task types and three LLMs.
