Zimtohrli: An efficient psychoacoustic audio similarity metric
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: eess.AS (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers:
- Neural Audio Codec with Adjustable Token Temporal Resolution Using Sampling-Frequency-Independent Convolutional Layers
  A single neural audio codec can operate at multiple token temporal resolutions (TTRs) by generating TTR-dependent convolutional kernels from shared parameters while adjusting kernel size and stride.
- DTT-BSR+: A Generative-Regression Cascade for Music Source Restoration
  DTT-BSR+ is a generative-then-regression cascade for music source restoration that reports MMSNR gains over single-stage DTT-BSR and X-LANCE on most stems, while noting a trade-off between distribution matching and reconstruction as measured by FAD.