and Afreen, Samina and Blasko, Barbara and Brazile, Tiffany L

Draelos, Rachel L · 2026 · DOI 10.1038/s41746-026-02428-5

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

MultiTurnPSB: Evaluating Multi-Turn Jailbreak Attacks and Classifier-Based Defenses for Medical AI Safety

cs.CR · 2026-05-30 · unverdicted · novelty 7.0

Multi-turn jailbreak attacks on medical AI increase unsafe responses from 35% to 80% by turn 4, expose 19x model gaps invisible in single-turn tests, and a lightweight classifier reduces unsafe outputs by 52 points at the cost of 45% false alarms on benign queries.

citing papers explorer

Showing 1 of 1 citing paper.

MultiTurnPSB: Evaluating Multi-Turn Jailbreak Attacks and Classifier-Based Defenses for Medical AI Safety · cs.CR · 2026-05-30 · unverdicted · none · ref 7
