Music source separation in the waveform domain

Francis Bach · 2019 · arXiv 1911.13254

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

ArtifactNet: Detecting AI-Generated Music via Forensic Residual Physics

cs.SD · 2026-04-17 · unverdicted · novelty 7.0

ArtifactNet extracts codec residuals from spectrograms with a 4M-parameter network to detect AI music at F1=0.9829 and 1.49% FPR on unseen tracks from 22 generators, outperforming larger baselines.

The Spheres Dataset: Multitrack Orchestral Recordings for Music Source Separation and Information Retrieval

eess.AS · 2025-11-26 · accept · novelty 7.0

The Spheres dataset provides multitrack orchestral recordings with isolated instrument stems and acoustic characterizations to support supervised machine learning for music source separation in the classical domain.

High Fidelity Neural Audio Compression

eess.AS · 2022-10-24 · accept · novelty 7.0

EnCodec is an end-to-end trained streaming neural audio codec that uses a single multiscale spectrogram discriminator and a gradient-normalizing loss balancer to achieve higher fidelity than prior methods at the same bitrates for 24 kHz mono and 48 kHz stereo audio.

MAGE: Modality-Agnostic Music Generation and Editing

cs.SD · 2026-04-10 · unverdicted · novelty 6.0

MAGE unifies text, visual, and audio-conditioned music generation and editing in one flow-based latent model with dynamic modality masking and cross-gated control.

Discrete Token Modeling for Multi-Stem Music Source Separation with Language Models

eess.AS · 2026-04-10 · unverdicted · novelty 6.0

A Conformer-conditioned decoder-only language model generates discrete tokens via a neural audio codec to separate four music stems, reaching near state-of-the-art perceptual quality and top NISQA on vocals in MUSDB18-HQ tests.

CodecSep: Prompt-Driven Universal Sound Separation on Neural Audio Codec Latents

cs.SD · 2025-09-15 · unverdicted · novelty 6.0

CodecSep performs prompt-driven universal sound separation directly in neural audio codec latents by combining a frozen DAC backbone with a lightweight FiLM-conditioned Transformer masker driven by CLAP embeddings, yielding efficiency gains over AudioSep.

Improving Music Source Separation with Diffusion and Consistency Refinement

cs.SD · 2024-12-09 · unverdicted · novelty 6.0

Diffusion-based refinement followed by consistency distillation improves music source separation quality and inference speed across U-Net and BS-RoFormer backbones on Slakh2100 and MUSDB18.

citing papers explorer

Showing 7 of 7 citing papers.

ArtifactNet: Detecting AI-Generated Music via Forensic Residual Physics
cs.SD · 2026-04-17 · unverdicted · none · ref 19

The Spheres Dataset: Multitrack Orchestral Recordings for Music Source Separation and Information Retrieval
eess.AS · 2025-11-26 · accept · none · ref 9

High Fidelity Neural Audio Compression
eess.AS · 2022-10-24 · accept · none · ref 8

MAGE: Modality-Agnostic Music Generation and Editing
cs.SD · 2026-04-10 · unverdicted · none · ref 5

Discrete Token Modeling for Multi-Stem Music Source Separation with Language Models
eess.AS · 2026-04-10 · unverdicted · none · ref 16

CodecSep: Prompt-Driven Universal Sound Separation on Neural Audio Codec Latents
cs.SD · 2025-09-15 · unverdicted · none · ref 11

Improving Music Source Separation with Diffusion and Consistency Refinement
cs.SD · 2024-12-09 · unverdicted · none · ref 6
