Attention-Based Models for Text-Dependent Speaker Verification

F A Rezaur Rahman Chowdhury; Ignacio Lopez Moreno; Li Wan; Quan Wang

arxiv: 1710.10470 · v3 · pith:GRLXRMSOnew · submitted 2017-10-28 · 📡 eess.AS · cs.LG · cs.SD · stat.ML

F A Rezaur Rahman Chowdhury, Quan Wang, Ignacio Lopez Moreno, Li Wan

classification 📡 eess.AS · cs.LG · cs.SD · stat.ML

keywords attention · attention-based · models · speaker · different · recognition · sequence · system

Attention-based models have recently shown great performance on a range of tasks, such as speech recognition, machine translation, and image captioning, due to their ability to summarize relevant information that spans the entire length of an input sequence. In this paper, we analyze the use of attention mechanisms for the problem of sequence summarization in our end-to-end text-dependent speaker recognition system. We explore different topologies of the attention layer and their variants, and compare different pooling methods on the attention weights. Ultimately, we show that attention-based models can improve the Equal Error Rate (EER) of our speaker verification system by 14% relative to our non-attention LSTM baseline model.
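
The sequence-summarization idea in the abstract can be illustrated with a minimal sketch: per-frame vectors (for example, LSTM outputs) are scored, the scores are normalized with a softmax, and the frames are combined as a weighted sum into a single utterance-level vector. The linear scoring function and the names `attention_pool`, `w`, and `b` are assumptions made for illustration, not necessarily the topology the authors use.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, w, b=0.0):
    """Summarize T frame vectors into one vector via attention weights.

    frames: list of T vectors, each a list of D floats (e.g. per-frame outputs).
    w, b:   hypothetical parameters of a linear score e_t = w . h_t + b.
    Returns the pooled D-dimensional vector and the T attention weights.
    """
    # One scalar score per frame.
    scores = [sum(wi * hi for wi, hi in zip(w, h)) + b for h in frames]
    # Normalize scores so the weights are positive and sum to 1.
    alphas = softmax(scores)
    dim = len(frames[0])
    # Weighted sum of frames, computed dimension by dimension.
    pooled = [sum(a * h[d] for a, h in zip(alphas, frames)) for d in range(dim)]
    return pooled, alphas


frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled_uniform, alphas_uniform = attention_pool(frames, w=[0.0, 0.0])
pooled_focus, alphas_focus = attention_pool(frames, w=[10.0, 0.0])
```

With all-zero scoring weights every frame gets the same weight, so the pooling reduces to a plain average over frames; a non-zero `w` shifts weight toward the frames that score highest.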

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Speaker Recognition with Random Digit Strings Using Uncertainty Normalized HMM-based i-vectors
eess.AS 2019-07 unverdicted novelty 6.0

Digit-specific HMM i-vectors with uncertainty normalization reach 1.52% male and 1.77% female EER on RSR2015 part III using only that corpus and simple cosine scoring.

Self Multi-Head Attention for Speaker Recognition
cs.SD 2019-06 unverdicted novelty 6.0

Self multi-head attention applied after CNN encoding of spectrograms outperforms temporal and statistical pooling for speaker verification on VoxCeleb1 with 18% relative EER reduction.