Title resolution pending

5 Pith papers cite this work. Polarity classification is still indexing.

verdicts

UNVERDICTED 5

representative citing papers

DASB - Discrete Audio and Speech Benchmark

cs.SD · 2024-06-20 · unverdicted · novelty 7.0

DASB is a new benchmark for discrete audio tokens. It shows that semantic tokens outperform acoustic ones, but that discrete representations remain less robust than continuous features across domains.

MAGE: Modality-Agnostic Music Generation and Editing

cs.SD · 2026-04-10 · unverdicted · novelty 6.0

MAGE unifies text, visual, and audio-conditioned music generation and editing in one flow-based latent model with dynamic modality masking and cross-gated control.

Two-Dimensional Quantization for Geometry-Aware Audio Coding

cs.SD · 2025-12-01 · unverdicted · novelty 6.0

Q2D2 uses 2D geometric grid projections to quantize feature pairs in neural audio codecs, yielding implicit codebooks that improve efficiency and utilization over RVQ, VQ, and FSQ while maintaining reconstruction quality.

XAttnMark: Learning Robust Audio Watermarking with Cross-Attention

cs.SD · 2025-02-06 · unverdicted · novelty 5.0

XAttnMark is a new neural audio watermarking method that uses partial parameter sharing, cross-attention for message retrieval, temporal conditioning, and a psychoacoustic TF masking loss; it reports state-of-the-art detection and attribution robustness.

citing papers explorer

Showing 5 of 5 citing papers.

  • DASB - Discrete Audio and Speech Benchmark · cs.SD · 2024-06-20 · unverdicted · none · ref 74

    DASB is a new benchmark for discrete audio tokens. It shows that semantic tokens outperform acoustic ones, but that discrete representations remain less robust than continuous features across domains.

  • MAGE: Modality-Agnostic Music Generation and Editing · cs.SD · 2026-04-10 · unverdicted · none · ref 22

    MAGE unifies text, visual, and audio-conditioned music generation and editing in one flow-based latent model with dynamic modality masking and cross-gated control.

  • Two-Dimensional Quantization for Geometry-Aware Audio Coding · cs.SD · 2025-12-01 · unverdicted · none · ref 57

    Q2D2 uses 2D geometric grid projections to quantize feature pairs in neural audio codecs, yielding implicit codebooks that improve efficiency and utilization over RVQ, VQ, and FSQ while maintaining reconstruction quality.

  • Embedding-Based Intrusive Evaluation Metrics for Musical Source Separation Using MERT Representations · eess.AS · 2026-04-22 · unverdicted · none · ref 14

    MERT embedding-based MSE and intrusive FAD metrics correlate more strongly with perceptual audio quality ratings than BSS-Eval metrics across stems and models in musical source separation.

  • XAttnMark: Learning Robust Audio Watermarking with Cross-Attention · cs.SD · 2025-02-06 · unverdicted · none · ref 45

    XAttnMark is a new neural audio watermarking method that uses partial parameter sharing, cross-attention for message retrieval, temporal conditioning, and a psychoacoustic TF masking loss; it reports state-of-the-art detection and attribution robustness.