Is reasoning capability enough for safety in long-context language models?

Yu Fu, Haz Sameen Shahgir, Huanli Gong, Zhipeng Wei, N Benjamin Erichson, Yue Dong · 2026 · arXiv 2602.08874

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

MT-JailBench: A Modular Benchmark for Understanding Multi-Turn Jailbreak Attacks

cs.CR · 2026-05-10 · unverdicted · novelty 6.0

MT-JailBench is a modular benchmark that standardizes evaluation of multi-turn jailbreaks to identify key success drivers and enable stronger combined attacks.

citing papers explorer

Showing 1 of 1 citing paper.

MT-JailBench: A Modular Benchmark for Understanding Multi-Turn Jailbreak Attacks

cs.CR · 2026-05-10 · unverdicted · none · ref 11 · internal anchor
