TERMS-Bench is a diagnostic benchmark for LLM negotiation agents that reveals agent-specific strategic failures beyond simple deal rates by using hidden-type simulators as oracles.
Economic behavior follows the same type-instrumental preset as Candid, but the cue channel is collapsed to neutral, noncommittal states
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.GT (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
TERMS-Bench: Diagnosing LLM Negotiation Agents Beyond Deal Rate