2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
LLMs produce significantly more contradictory medical conclusions from positively versus negatively framed questions than from same-framing pairs, even when grounded in identical expert-selected evidence.
citing papers explorer
- Investigating and Alleviating Harm Amplification in LLM Interactions
  Presents the HarmAmp benchmark for multi-turn harm amplification in LLMs and TrajSafe, a proactive monitor that reduces harm while keeping over-refusal low and preserving capabilities.
- This Treatment Works, Right? Evaluating LLM Sensitivity to Patient Question Framing in Medical QA
  LLMs produce significantly more contradictory medical conclusions from positively versus negatively framed questions than from same-framing pairs, even when grounded in identical expert-selected evidence.