WavJEPA: Semantic learning unlocks robust audio foundation models for raw waveforms
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
OLIVE is a new self-supervised speech representation framework that unifies view-augmented masked latent prediction with waveform reconstruction under one objective.
Probing Spatial Structure in Pretrained Audio Representations
Introduces SARL benchmark showing pretrained audio encoders encode source-level spatial factors more readily than room-level factors, with patterns shaped by input configuration and training paradigm.