wav2vec 2.0: A framework for self-supervised learning of speech representations

Alexei Baevski, Yuhao Zhou, Abdelrahman Mohamed, Michael Auli · 2020

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

Where Does Speech Enhancement Adapt? Probing Study Under Controlled Degradation

eess.AS · 2025-11-29 · unverdicted · novelty 6.0

Encoder layers in speech enhancement models maintain noise-invariant representations while decoder layers adapt strongly to degradation, with the pattern consistent across architectures and degradation types.

Can Hierarchical Cross-Modal Fusion Predict Human Perception of AI Dubbed Content?

eess.AS · 2026-03-30 · unverdicted · novelty 4.0

A hierarchical cross-modal fusion architecture with LoRA adapters predicts human perception of AI-dubbed clips at PCC > 0.75 after training on 12k Hindi-English examples and human MOS fine-tuning.

citing papers explorer

Showing 2 of 2 citing papers.

Where Does Speech Enhancement Adapt? Probing Study Under Controlled Degradation

eess.AS · 2025-11-29 · unverdicted · none · ref 22

Encoder layers in speech enhancement models maintain noise-invariant representations while decoder layers adapt strongly to degradation, with the pattern consistent across architectures and degradation types.

Can Hierarchical Cross-Modal Fusion Predict Human Perception of AI Dubbed Content?

eess.AS · 2026-03-30 · unverdicted · none · ref 19

A hierarchical cross-modal fusion architecture with LoRA adapters predicts human perception of AI-dubbed clips at PCC > 0.75 after training on 12k Hindi-English examples and human MOS fine-tuning.
