LLM-based multi-talker ASR with dual-encoder, feature interleaving, length-aware speaker loss, and adaptive ASR threshold achieves 18% and 24% relative gains over baselines on AliMeeting and Aishell4.
The pipeline systems, whether using 3D-Speaker or DiariZen-large for diarization, achieve compa- rable results
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
eess.AS 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Balancing ASR and diarization in end-to-end LLMs for multi-talker speech recognition
LLM-based multi-talker ASR with dual-encoder, feature interleaving, length-aware speaker loss, and adaptive ASR threshold achieves 18% and 24% relative gains over baselines on AliMeeting and Aishell4.