pith. machine review for the scientific record.

OR-Bench: An over-refusal benchmark for large language models. arXiv preprint arXiv:2405.20947

8 Pith papers cite this work. Polarity classification is still indexing.

years

2026: 8

verdicts

unverdicted: 8

representative citing papers

IatroBench: Pre-Registered Evidence of Iatrogenic Harm from AI Safety Measures

cs.AI · 2026-04-09 · unverdicted · novelty 6.0

AI models exhibit identity-contingent withholding: in identical scenarios they give physicians better clinical guidance on benzodiazepine tapering than they give laypeople, with a measured decoupling gap of +0.38 and a 13.1-percentage-point drop in safety-critical action hit rates.

Knowledge Distillation Must Account for What It Loses

cs.LG · 2026-04-28 · unverdicted · novelty 4.0 · 2 refs

Knowledge distillation evaluations must report lost teacher capabilities via a Distillation Loss Statement rather than relying solely on task scores.
