Training Multi-Task Adversarial Network for Extracting Noise-Robust Speaker Embedding

Jianfeng Zhou, Tao Jiang, Lin Li, Qingyang Hong, Zhe Wang, Bingyin Xia

classification: cs.SD, eess.AS

keywords: training, speaker, adversarial, embedding, multi-task, noise-robust, performance, environments

Achieving robust speaker recognition performance in noisy environments remains a challenging task. Motivated by the promising performance of multi-task training in a variety of image processing tasks, we explore the potential of multi-task adversarial training for learning a noise-robust speaker embedding. In this paper, we present a novel framework consisting of three components: an encoder that extracts a noise-robust speaker embedding, a classifier that classifies the speakers, and a discriminator that discriminates the noise type of the speaker embedding. In addition, we propose a training strategy that uses the training accuracy as an indicator to stabilize the multi-class adversarial optimization process. We conduct experiments on English and Mandarin corpora, and the results demonstrate that the proposed multi-task adversarial training method can greatly outperform methods without adversarial training in noisy environments. Furthermore, the experiments indicate that our method also improves speaker verification performance in the clean condition.
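The abstract names the three components and an accuracy-based stabilization strategy but gives no formulas, so the following is only a rough sketch under stated assumptions, not the authors' implementation. It assumes a common adversarial multi-task formulation in which the encoder minimizes the speaker classification loss while maximizing the discriminator's noise-type loss, and it assumes the training accuracy of the discriminator is used to decide which component to update next. The function names, the weight `adv_weight`, and the thresholds `low` and `high` are all hypothetical.

```python
import math


def cross_entropy(probs, label):
    """Cross-entropy of one predicted distribution against an integer label."""
    return -math.log(probs[label])


def encoder_objective(speaker_probs, speaker_label,
                      noise_probs, noise_label, adv_weight=1.0):
    """Assumed encoder loss: speaker loss minus a weighted noise-type loss.

    Minimizing it rewards correct speaker classification and penalizes
    embeddings from which the discriminator can predict the noise type.
    """
    speaker_loss = cross_entropy(speaker_probs, speaker_label)
    noise_loss = cross_entropy(noise_probs, noise_label)
    return speaker_loss - adv_weight * noise_loss


def choose_update(disc_train_acc, low=0.4, high=0.8):
    """Hypothetical accuracy-gated schedule (thresholds are illustrative).

    A weak discriminator is trained alone until it catches up; a strong
    one is frozen while the encoder adapts; otherwise both are updated.
    """
    if disc_train_acc < low:
        return "discriminator"
    if disc_train_acc > high:
        return "encoder"
    return "both"
```

In a training loop, `choose_update` would be called once per step or epoch with the discriminator's running training accuracy. Gating updates this way is one plausible means of keeping either side of the adversarial game from overpowering the other; the paper itself should be consulted for the actual rule.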
