Librispeech: An ASR corpus based on public domain audio books
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- Asymmetric Encoder-Decoder Based on Time-Frequency Correlation for Speech Separation
  SR-CorrNet introduces an asymmetric TF-domain architecture with a separation-reconstruction strategy and correlation-to-filter estimation that yields consistent gains on WSJ0-Mix, WHAMR!, and LibriCSS under anechoic, noisy-reverberant, and real-recorded conditions.
- Time vs. Layer: Locating Predictive Cues for Dysarthric Speech Descriptors in wav2vec 2.0
  Layer-wise aggregation from wav2vec 2.0 best predicts intelligibility in dysarthric speech, while time-wise aggregation is better for imprecise consonants, harsh voice, and monoloudness.