JOIST: A joint speech and text streaming model for ASR
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.SD (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
- Refining Pseudo-Audio Prompts with Speech-Text Alignment for Text-Only Domain Adaptation in LLM-Based ASR
A speech-text alignment method generates expressive pseudo-audio prompts for effective text-only domain adaptation in LLM-based ASR, outperforming prior text-only approaches on error rates and OOV coverage.