CleanCodec reframes audio tokenization as a selective information bottleneck to encode only perceptually important features at 12.5 tokens per second, outperforming prior codecs in efficiency, speaker similarity, and intelligibility.
WHAM!: Extending speech separation to noisy environments
2 Pith papers cite this work. Polarity classification is still indexing.
years
2026 (2)
verdicts
UNVERDICTED (2)
representative citing papers
SR-CorrNet introduces an asymmetric TF-domain architecture with separation-reconstruction strategy and correlation-to-filter estimation that yields consistent gains on WSJ0-Mix, WHAMR!, and LibriCSS under anechoic, noisy-reverberant, and real-recorded conditions.
citing papers explorer
- Asymmetric Encoder-Decoder Based on Time-Frequency Correlation for Speech Separation