wav2vec 2.0: A framework for self-supervised learning of speech representations
2 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 2
representative citing papers
citing papers explorer
- BEST-RQ-Based Self-Supervised Learning for Whisper Domain Adaptation
  BEARD adapts the Whisper encoder to the ATC domain via BEST-RQ and distillation on 5,000 hours of unlabeled speech, followed by fine-tuning on 2 hours of labeled speech, delivering a 12% relative WER gain over a fine-tuned baseline.
- A Study of Data Selection Strategies for Pre-training Self-Supervised Speech Models
  Prioritizing the longest utterances in SSL speech pre-training data outperforms random or diversity-based sampling on ASR performance while using half the data volume.