Hearing between the lines: Unlocking the reasoning power of LLMs for speech evaluation
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.SD (1). Years: 2026 (1). Verdicts: UNVERDICTED (1).
Representative citing papers:
ParaPairAudioBench: Paralinguistic Pairwise Audio Benchmark for LALM-as-a-Judge
ParaPairAudioBench is a new pairwise benchmark showing LALM judges lag human paralinguistic judgments by 32 percentage points with poor tie calibration across style, rate, emphasis, age, and gender.