Quantifying bias in automatic speech recognition
6 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: background (2)
citing papers explorer
- Toward Fair Speech Technologies: A Comprehensive Survey of Bias and Fairness in Speech AI
  The paper delivers a unified framework for fairness in speech technologies by formalizing seven definitions, organizing research into three paradigms, diagnosing pipeline-specific biases, and mapping mitigations to those sources.
- "This Wasn't Made for Me": Recentering User Experience and Emotional Impact in the Evaluation of ASR Bias
  ASR bias causes users from underrepresented dialects to internalize failures as personal inadequacy and perform extensive emotional and linguistic labor, revealing harms missed by accuracy-only evaluations.
- Voice, Bias, and Coreference: An Interpretability Study of Gender in Speech Translation
  Speech translation (ST) models override the masculine bias of the internal language model (ILM) using acoustic input, relying on first-person pronouns to link terms to the speaker and drawing gender cues from the full frequency spectrum rather than pitch alone.
- Speak Your Mind: The Speech Continuation Task as a Probe of Voice-Based Model Bias
  The authors perform the first systematic bias evaluation in speech continuation tasks across three models, revealing gender interactions in text metrics and stronger reversion to modal phonation for female prompts.
- Few-Shot Accent Synthesis for ASR with LLM-Guided Phoneme Editing
  Few-shot TTS adaptation combined with LLM-guided phoneme editing produces synthetic accented speech that improves ASR word error rates on real accented audio, even in cross-speaker and ultra-low-data settings.
- Demographic and Linguistic Bias Evaluation in Omnimodal Language Models
  Omnimodal models show less demographic bias on image and video tasks than on audio tasks, where biases are substantial and performance is lower.