EVA-Bench supplies a simulation engine for bot-to-bot voice dialogues plus two composite metrics (EVA-A for accuracy, EVA-X for experience) evaluated on 213 enterprise scenarios, showing no tested system exceeds 0.5 on both pass@1 scores.
stay open
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
citation-role summary
other 1
citation-polarity summary
fields
cs.SD 1years
2026 1verdicts
UNVERDICTED 1roles
other 1polarities
unclear 1representative citing papers
citing papers explorer
-
EVA-Bench: A New End-to-end Framework for Evaluating Voice Agents
EVA-Bench supplies a simulation engine for bot-to-bot voice dialogues plus two composite metrics (EVA-A for accuracy, EVA-X for experience) evaluated on 213 enterprise scenarios, showing no tested system exceeds 0.5 on both pass@1 scores.