NISQA: A deep CNN-Self-Attention model for multidimensional speech quality prediction with crowdsourced datasets

2021 · DOI 10.21437/interspeech.2021-299

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

LambdaMark: Semantic Audio Watermarking for Robustness and Radioactivity

cs.SD · 2026-06-19 · unverdicted · novelty 8.0

LambdaMark is the first generic radioactive audio watermark that injects multi-bit messages into semantic latent representations, achieving robustness to distortions and removal attacks even after downstream model finetuning.

SenSE: Semantic-Aware High-Fidelity Universal Speech Enhancement

eess.AS · 2025-09-29 · unverdicted · novelty 6.0

SenSE adds language-model semantic guidance to flow-matching generative speech enhancement via a dual-path masked conditioning strategy and reports SOTA results on distorted speech.

HybridCodec: Modeling Discrete and Continuous Representations for Efficient Speech Language Models

cs.LG · 2026-06-26 · unverdicted · novelty 5.0

HybridCodec combines discrete tokens with continuous residuals via a focal modulation codec and hybrid Transformer to improve speaker retention and reduce autoregressive steps in speech language models.

citing papers explorer

LambdaMark: Semantic Audio Watermarking for Robustness and Radioactivity

cs.SD · 2026-06-19 · unverdicted · none · ref 53

SenSE: Semantic-Aware High-Fidelity Universal Speech Enhancement

eess.AS · 2025-09-29 · unverdicted · none · ref 22

HybridCodec: Modeling Discrete and Continuous Representations for Efficient Speech Language Models

cs.LG · 2026-06-26 · unverdicted · none · ref 52
