Title resolution pending

Keramati, Ali; Warschauer, Mark · DOI 10.5281/zenodo.17196206

2 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Early-Token Confidence Predicts Reasoning Quality in Multi-Agent LLM Debate

cs.CL · 2026-06-09 · unverdicted · novelty 5.0

Early-token log-probabilities from LLM decoding are stronger predictors of reasoning quality than full-sequence statistics in multi-agent debate on essay scoring tasks.

The Confident Liar: Diagnosing Multi-Agent Debate with Log-Probabilities and LLM-as-Judge

cs.CL · 2026-06-09 · unverdicted · novelty 5.0

In two-agent debate, log-probability confidence aligns with LLM-judged reasoning quality roughly twice as strongly for the Constructor (AUROC 0.804 for critical failure detection) as for the Auditor (0.634).

citing papers explorer

Showing 2 of 2 citing papers.

Early-Token Confidence Predicts Reasoning Quality in Multi-Agent LLM Debate

cs.CL · 2026-06-09 · unverdicted · none · ref 15

The Confident Liar: Diagnosing Multi-Agent Debate with Log-Probabilities and LLM-as-Judge

cs.CL · 2026-06-09 · unverdicted · none · ref 15
