A validation-gated framework rules out analysis of implicit suicidal-intent separation, but identifies a recurring low-rank, mid-network semantic feature that is causally implicated in binary suicide detection across models and datasets and is more specific than general distress.
Transformer Circuits Thread
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CL (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
A Validation-Gated Mechanistic Account of Suicidality Detection in LLMs