SelectTSL is an end-to-end model using a Prompt-Guided Selective Attention Module and IPD enhancer to localize only prompt-specified target sounds and estimate their count and direction in complex acoustic scenes.
The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, subjective testing framework, and challenge results. arXiv preprint arXiv:2005.13981
4 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 4
representative citing papers
PATSE is a DOA-guided target speaker extraction system that produces speaker-attributed streams for diarization-free ASR in multi-party conversations.
SenSE adds language-model semantic guidance to flow-matching generative speech enhancement via a dual-path masked conditioning strategy and reports SOTA results on distorted speech.
Fast-ULCNet matches original ULCNet speech enhancement quality while cutting model size by more than half and latency by 34% via FastGRNN replacement and a state-drift filter.
citing papers explorer
- SenSE: Semantic-Aware High-Fidelity Universal Speech Enhancement