Spgis- peech 2.0: Transcribed multi-speaker financial audio for speaker- tagged transcription

· 2025 · arXiv 2508.05554

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

representative citing papers

Audio2Tool: Speak, Call, Act -- A Dataset for Benchmarking Speech Tool Use

cs.SD · 2026-04-17 · unverdicted · novelty 6.0

Audio2Tool is a new benchmark dataset that shows speech models perform well on simple commands but degrade sharply on compositional tasks and realistic acoustic noise.

Raon-OpenTTS: Open Models and Data for Robust Text-to-Speech

eess.AS · 2026-05-20 · unverdicted · novelty 5.0

Raon-OpenTTS provides an open 510K-hour curated speech dataset and DiT-based TTS models up to 1B parameters that achieve competitive WER and speaker similarity on benchmarks versus closed models trained on millions of hours.

citing papers explorer

Showing 2 of 2 citing papers.

Audio2Tool: Speak, Call, Act -- A Dataset for Benchmarking Speech Tool Use cs.SD · 2026-04-17 · unverdicted · none · ref 25
Audio2Tool is a new benchmark dataset that shows speech models perform well on simple commands but degrade sharply on compositional tasks and realistic acoustic noise.
Raon-OpenTTS: Open Models and Data for Robust Text-to-Speech eess.AS · 2026-05-20 · unverdicted · none · ref 10
Raon-OpenTTS provides an open 510K-hour curated speech dataset and DiT-based TTS models up to 1B parameters that achieve competitive WER and speaker similarity on benchmarks versus closed models trained on millions of hours.

Spgis- peech 2.0: Transcribed multi-speaker financial audio for speaker- tagged transcription

fields

years

verdicts

representative citing papers

citing papers explorer