Correct reasoning traces exhibit positive confidence gain while incorrect traces show declining confidence, enabling CDG-based voting that boosts performance on AIME, HMMT and BRUMO benchmarks across multiple LLM architectures.
Ranked voting based self-consistency of large language models.arXiv preprint arXiv:2505.10772,
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2verdicts
UNVERDICTED 2representative citing papers
CSI meta-scaffold unifies five LLM agent harnesses; a blackboard multi-agent system solves 19/33 cybench challenges (57.6%) versus 15/33 for the best single scaffold.
citing papers explorer
-
Inference Time Optimization with Confidence Dynamics
Correct reasoning traces exhibit positive confidence gain while incorrect traces show declining confidence, enabling CDG-based voting that boosts performance on AIME, HMMT and BRUMO benchmarks across multiple LLM architectures.