Automatic speech recognition for under-resourced languages: A survey
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CL (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
citing papers explorer
- Efficient ASR Training with Conversations that Never Happened
  Mixing 636 hours of LLM-generated synthetic conversations with 67 hours of real data outperforms a model trained on 2700 hours of real Hungarian speech on the BEA-Dialogue benchmark.
- PashtoTTS-Bench: automated screening for low-resource non-Latin-script text-to-speech
  Introduces INSV-A automated screening benchmark for Pashto TTS systems reporting WER, script fidelity, and LID results across five systems on FLEURS and Common Voice prompts.
- Bridging the Usability Gap: Lessons from Interpreting Studies for Machine Interpreting Design
  Machine interpreting should shift from fidelity metrics to three design priorities (agency, grounding, and experience) drawn from interpreting studies to close the usability gap with human-mediated communication.