NISQA: A deep CNN-Self-Attention model for multidimensional speech quality prediction with crowdsourced datasets
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 3
representative citing papers
SenSE adds language-model semantic guidance to flow-matching generative speech enhancement via a dual-path masked conditioning strategy and reports SOTA results on distorted speech.
HybridCodec combines discrete tokens with continuous residuals via a focal modulation codec and hybrid Transformer to improve speaker retention and reduce autoregressive steps in speech language models.
citing papers explorer
- LambdaMark: Semantic Audio Watermarking for Robustness and Radioactivity
  LambdaMark is the first generic radioactive audio watermark that injects multi-bit messages into semantic latent representations, achieving robustness to distortions and removal attacks even after downstream model finetuning.
- SenSE: Semantic-Aware High-Fidelity Universal Speech Enhancement
  SenSE adds language-model semantic guidance to flow-matching generative speech enhancement via a dual-path masked conditioning strategy and reports SOTA results on distorted speech.
- HybridCodec: Modeling Discrete and Continuous Representations for Efficient Speech Language Models
  HybridCodec combines discrete tokens with continuous residuals via a focal modulation codec and hybrid Transformer to improve speaker retention and reduce autoregressive steps in speech language models.