On the Importance of Temporal Context in Proximity Kernels: A Vocal Separation Case Study

Delia Fano Yela; Derry FitzGerald; Mark Sandler; Sebastian Ewert

arxiv: 1702.02130 · v2 · pith:QYI7B2PFnew · submitted 2017-02-07 · 💻 cs.SD

classification: cs.SD

keywords: separation, context, kernel, kernels, bins, frames, information, similarity

Musical source separation methods exploit source-specific spectral characteristics to facilitate the decomposition process. Kernel Additive Modelling (KAM) models a source by applying robust statistics to time-frequency bins as specified by a source-specific kernel, a function defining similarity between bins. Kernels in existing approaches are typically defined using metrics between single time frames. In the presence of noise and other sound sources, however, information from a single frame turns out to be unreliable, and incorrect frames are often selected as similar. In this paper, we incorporate a temporal context into the kernel to provide additional information that stabilizes the similarity search. Evaluated in the context of vocal separation, our simple extension led to a considerable improvement in separation quality compared to previous kernels.
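
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the squared Euclidean distance, the edge padding, the median as the robust statistic, and the function names `stack_context`, `nearest_frames`, and `median_estimate` are all assumptions made for illustration. Setting `context=0` compares single frames only, while `context > 0` compares each frame together with its neighbouring frames before the similar frames are selected.

```python
import numpy as np


def stack_context(S, context):
    """Stack each frame of S (n_freq x n_frames) with `context` neighbours per side.

    Hypothetical helper: edges are handled by repeating the first/last frame.
    Returns an array of shape (n_frames, n_freq * (2 * context + 1)).
    """
    n_frames = S.shape[1]
    padded = np.pad(S, ((0, 0), (context, context)), mode="edge")
    width = 2 * context + 1
    # Row t holds frames t-context .. t+context flattened into one vector.
    return np.stack([padded[:, t:t + width].ravel() for t in range(n_frames)])


def nearest_frames(S, k, context=0):
    """Indices of the k most similar frames for every frame (assumed metric:
    squared Euclidean distance between context-stacked frames)."""
    feats = stack_context(S, context)
    sq = np.sum(feats ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (feats @ feats.T)
    # Each frame is its own nearest neighbour, so it is included in the k.
    return np.argsort(dist, axis=1)[:, :k]


def median_estimate(S, k, context=0):
    """Per-bin median over the k selected frames, used here as the robust
    statistic that estimates the repeating (non-vocal) part of S."""
    idx = nearest_frames(S, k, context)   # shape (n_frames, k)
    neighbours = S[:, idx]                # shape (n_freq, n_frames, k)
    return np.median(neighbours, axis=2)  # shape (n_freq, n_frames)
```

Given a magnitude spectrogram `S`, `median_estimate(S, k=3, context=0)` corresponds to a single-frame kernel in this sketch, and `median_estimate(S, k=3, context=2)` to a kernel that also looks two frames to each side. The values of `k` and `context` are placeholders, not settings reported in the paper.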
