Preference-ASR: A Preference-Aware Test Set for Benchmarking ASR in the Era of Speech LLMs

2026 · cs.CL · arXiv 2606.29534

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Popular ASR test sets adopt inconsistent conventions for numbers, disfluencies, entities, and casing, while standard normalizers erase the format distinctions users care about. Current benchmarks therefore cannot measure whether a model follows user preferences for output style. We introduce PreferenceASR, a test set evaluating ASR systems on their ability to follow natural-language preference instructions across four categories: normalization, entities, disfluencies, and case. Built from seven open-source corpora via a two-stage LLM-assisted pipeline with human verification, it is evaluated with a preference-aware normalizer that selectively skips steps matching the active instruction. Benchmarking four models shows rankings shift across preference types, exposing quality differences traditional evaluation obscures. We publicly release the dataset.

representative citing papers

Preference-ASR: A Preference-Aware Test Set for Benchmarking ASR in the Era of Speech LLMs

cs.CL · 2026-06-28 · unverdicted · novelty 7.0

PreferenceASR is a preference-aware ASR test set built from seven corpora that shows model rankings change when user output-style instructions are considered.
