Listen, attend and spell: A neural network for large vocabulary conversational speech recognition
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2019 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Phoneme-Based Contextualization for Cross-Lingual Speech Recognition in End-to-End Models
  An E2E ASR model with mixed wordpieces and phonemes improves foreign proper noun recognition via phoneme-level contextual biasing, showing 16% gain over grapheme-only and 8% over wordpiece-only baselines.
- End-to-End ASR for Code-switched Hindi-English Speech
  End-to-end ASR for code-switched Hindi-English with <50 hours of data shows gains from multi-task learning and corpus balancing but underperforms cascaded baselines.