CarryOnBench shows that LLMs recover utility after benign intent clarifications in multi-turn conversations but exhibit utility lock-in, unsafe recovery, and repetitive responses.
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CL (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
- Useless but Safe? Benchmarking Utility Recovery with User Intent Clarification in Multi-Turn Conversations