No evidence of tonal context compensation in pre-trained wav2vec2.0 embeddings; probing classifiers show some evidence in the fine-tuned model but fail to replicate human performance on isolated syllables.
Perceptual compensation for tonal context in self-supervised speech models
abstract
This study examines the extent to which the wav2vec2.0 architecture exhibits evidence of compensation for phonological context. We conducted a pseudo-replication of a perceptual compensation experiment on Mandarin Chinese tones and compared embedding similarities and probing classifier outputs between a purely self-supervised pre-trained model and a model fine-tuned for Mandarin ASR. No evidence of compensation was found in the embedding similarities of the purely pre-trained model. Probing classifiers showed some evidence of compensation in addition to the expected layer-wise improvements in categorization, but failed to replicate human performance on isolated test syllables. Our findings contrast with previous reports of sensitivity to phonological structure emerging through pre-training alone, and suggest that supervised objectives may be necessary to encourage the abstraction of at least some types of phonological regularities.
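To make the embedding-similarity comparison mentioned in the abstract more concrete, here is a minimal sketch of how a target syllable's embedding might be compared against per-tone reference embeddings. It is an illustration under stated assumptions, not the authors' procedure. The function names, the tone labels `T2` and `T3`, and the three-dimensional toy vectors are all hypothetical. Real wav2vec2.0 embeddings would be taken from the model's hidden layers and have far more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def similarity_profile(target, references_by_tone):
    """Similarity of `target` to the centroid of each tone's reference set."""
    return {
        tone: cosine_similarity(target, centroid(vectors))
        for tone, vectors in references_by_tone.items()
    }


def nearest_tone(target, references_by_tone):
    """Tone whose reference centroid is most similar to `target`."""
    profile = similarity_profile(target, references_by_tone)
    return max(profile, key=profile.get)


# Hypothetical toy data: NOT taken from the study.
references = {
    "T2": [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]],
    "T3": [[0.0, 0.2, 1.0], [0.1, 0.1, 0.9]],
}
# The same hypothetical target syllable embedded in two different contexts.
target_in_context_a = [0.6, 0.1, 0.5]
target_in_context_b = [0.5, 0.1, 0.6]

if __name__ == "__main__":
    for name, target in [("context A", target_in_context_a),
                         ("context B", target_in_context_b)]:
        print(name, similarity_profile(target, references),
              nearest_tone(target, references))
```

In a setup like this, one would look at whether the similarity profile of the same target shifts with the preceding context, and repeat the comparison per layer. A probing classifier replaces the centroid comparison with a model trained on layer embeddings to predict tone labels.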
fields: cs.CL
year: 2026
verdict: UNVERDICTED