Alleviating attention hacking in discriminative reward modeling through interaction distillation

4 Pith papers cite this work. Polarity classification is still indexing.

Years: 2026 (4)

Verdicts: unverdicted (4)

representative citing papers

Relevant Is Not Warranted: Evidence-Force Calibration for Cited RAG

cs.AI · 2026-05-27 · unverdicted · novelty 7.0

FORCEBENCH shows that model judges often violate the expected ordering on evidence-calibrated versus force-raised claim pairs: standard support prompting yields 47.2% MVR, and explicit warrant prompting reduces it to 24.5%.

citing papers explorer

Showing 2 of 2 citing papers after filters.