If the user hangs up prematurely—for example, providing actionable information and ending the call in the same turn—the agent has no opportunity to execute the required tool calls.

Premature ending

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

EVA-Bench: A New End-to-end Framework for Evaluating Voice Agents

cs.SD · 2026-05-13 · accept · novelty 8.0

EVA-Bench introduces a simulation-plus-scoring framework for voice agents that reveals no tested system exceeds 0.5 on both accuracy and experience metrics at pass@1.

citing papers explorer

EVA-Bench: A New End-to-end Framework for Evaluating Voice Agents cs.SD · 2026-05-13 · accept · none · ref 47
