arXiv 2505.10772

Weiqin Wang, Yile Wang, Hui Huang · 2025 · arXiv 2505.10772

3 Pith papers cite this work. Polarity classification is still being indexed.

representative citing papers

cs.CL · 2026-06-03 · unverdicted · novelty 6.0

RISC reformulates self-consistency answer selection as a ranking task solved by a lightweight LambdaRank model with five hand-designed features, yielding better accuracy-efficiency trade-offs than majority voting on QA benchmarks.

Inference Time Optimization with Confidence Dynamics

cs.CL · 2026-05-24 · unverdicted · novelty 6.0

Correct reasoning traces exhibit positive confidence gain while incorrect traces show declining confidence, enabling CDG-based voting that boosts performance on AIME, HMMT and BRUMO benchmarks across multiple LLM architectures.

Towards Cybersecurity SuperIntelligence (CSI): What's the best harness for cybersecurity?

cs.CR · 2026-05-27 · unverdicted · novelty 4.0

The CSI meta-scaffold unifies five LLM agent harnesses; a blackboard multi-agent system solves 19/33 Cybench challenges (57.6%) versus 15/33 (45.5%) for the best single scaffold.

