The Benefits of a Concise Chain of Thought on Problem-Solving in Large Language Models

Matthew Renze, Erhan Guven · 2024 · arXiv 3129.2024

9 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Open Datasets in Learning Analytics: Trends, Challenges, and Best PRACTICE

cs.CY · 2026-02-19 · accept · novelty 8.0

A survey of 172 open educational datasets from 204 papers across LAK, EDM, and AIED conferences reveals trends, 143 previously uncatalogued datasets, field gaps, and an 8-item PRACTICE checklist for better data publication.

Supervising Ralph Wiggum: Exploring a Metacognitive Co-Regulation Agentic AI Loop for Engineering Design

cs.AI · 2026-03-25 · conditional · novelty 7.0

Metacognitive self- and co-regulation loops improve LLM agent performance in engineering design by mitigating fixation and enabling better exploration of design options.

Contract Based Verification of Non-functional Requirements for Embedded Automotive C Code

cs.PL · 2026-05-19 · unverdicted · novelty 6.0

The authors define general non-functional rules for C modules, propose an interface contract language, implement a Frama-C checker plugin, and demonstrate verification on two Scania truck codebases alongside ACSL functional contracts.

Stop When Reasoning Converges: Semantic-Preserving Early Exit for Reasoning Models

cs.CL · 2026-05-17 · unverdicted · novelty 6.0

PUMA detects reasoning-level semantic redundancy to enable early exit in chains of thought, achieving 26.2% average token reduction across five LRMs and five benchmarks while preserving accuracy and CoT quality.

Adversarial Arena: Crowdsourcing Data Generation through Interactive Competition

cs.AI · 2026-04-20 · unverdicted · novelty 6.0

Adversarial competition between attacker and defender teams generates diverse multi-turn conversational data that improves LLM performance on secure code generation benchmarks by 18-29%.

LLM4C2Rust: Large Language Models for Automated Memory-Safe Code Transpilation

cs.SE · 2026-04-16 · unverdicted · novelty 5.0

A RAG-enhanced LLM pipeline with segmentation improves C-to-Rust transpilation correctness and eliminates raw pointer dereferences and unsafe type casts in several Coreutils programs.

SAT: Balancing Reasoning Accuracy and Efficiency with Stepwise Adaptive Thinking

cs.AI · 2026-04-09 · unverdicted · novelty 5.0

SAT reduces reasoning tokens by up to 40% across multiple large reasoning models and benchmarks by adaptively pruning steps based on difficulty while maintaining or improving accuracy.

Understanding the Self-Reflection Mechanisms of LLMs through Biased Attitude Associations

cs.SI · 2026-05-30 · unverdicted · novelty 4.0

ReBias-Lens shows LLM self-reflection produces layer-wise smoothing of global valence fluctuations that reduces behavioral bias overall, yet selectively locks in and amplifies certain category-specific biases.

IntentScore: Intent-Conditioned Action Evaluation for Computer-Use Agents

cs.AI · 2026-04-06 · 2 refs

citing papers explorer

Showing 4 of 4 citing papers after filters.

Supervising Ralph Wiggum: Exploring a Metacognitive Co-Regulation Agentic AI Loop for Engineering Design

cs.AI · 2026-03-25 · conditional · none · ref 47

Metacognitive self- and co-regulation loops improve LLM agent performance in engineering design by mitigating fixation and enabling better exploration of design options.

Adversarial Arena: Crowdsourcing Data Generation through Interactive Competition

cs.AI · 2026-04-20 · unverdicted · none · ref 45

Adversarial competition between attacker and defender teams generates diverse multi-turn conversational data that improves LLM performance on secure code generation benchmarks by 18-29%.

SAT: Balancing Reasoning Accuracy and Efficiency with Stepwise Adaptive Thinking

cs.AI · 2026-04-09 · unverdicted · none · ref 28

SAT reduces reasoning tokens by up to 40% across multiple large reasoning models and benchmarks by adaptively pruning steps based on difficulty while maintaining or improving accuracy.

IntentScore: Intent-Conditioned Action Evaluation for Computer-Use Agents

cs.AI · 2026-04-06 · unreviewed · ref 14 · 2 links
