Survival analysis applied to repeated jailbreak attacks on three LLMs shows one model degrades rapidly while the others maintain moderate vulnerability on HarmBench prompts.
arXiv preprint arXiv:2510.02712
2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CR 2years
2026 2verdicts
UNVERDICTED 2representative citing papers
Proposes the TRIAD framework that treats multi-turn multimodal attacks as continuous trajectories and uses structural anomaly detection, regularized Mahalanobis distance, topological acceleration, and a time-varying Cox model with Bayesian HMM feedback to predict and bound expected time-to-failure.
citing papers explorer
-
Quantifying LLM Safety Degradation Under Repeated Attacks Using Survival Analysis
Survival analysis applied to repeated jailbreak attacks on three LLMs shows one model degrades rapidly while the others maintain moderate vulnerability on HarmBench prompts.
-
Surviving the Unseen: Predictive Defense for Novel Multi-Turn Multimodal Attacks
Proposes the TRIAD framework that treats multi-turn multimodal attacks as continuous trajectories and uses structural anomaly detection, regularized Mahalanobis distance, topological acceleration, and a time-varying Cox model with Bayesian HMM feedback to predict and bound expected time-to-failure.