
Think-bench: Evaluating thinking efficiency and chain-of-thought quality of large reasoning models

7 Pith papers cite this work. Polarity classification is still indexing.


years: 2026 (5), 2025 (2)

roles: method (1)

polarities: use method (1)

representative citing papers

RecurGuard: Runtime Monitoring for Reasoning-Token Consumption Attacks

cs.CR · 2026-06-06 · unverdicted · novelty 6.0

RecurGuard monitors recurrence rate, volume growth, and query progress in exposed reasoning traces to terminate generation on token-consumption attacks, reporting 99% detection on OverThink and 92% on ExtendAttack with near-zero false positives.

CRANE: Constrained Reasoning Injection for Code Agents via Nullspace Editing

cs.SE · 2026-05-13 · unverdicted · novelty 6.0 · 2 refs

CRANE applies magnitude thresholding, a Conservative Taylor Gate, and Graduated Sigmoidal Projection to the Thinking-Instruct delta to improve code agent pass rates on Roo-Eval, SWE-bench-Verified, and Terminal-Bench while preserving efficiency.
