A multi-view model combining ResNet-processed spectrograms, BiLSTM MFCCs, and HuBERT embeddings with context-guided cross-modal attention reaches 91.51% accuracy on speaker-independent 5-fold CV of the PC-GITA corpus for PD detection.
Physiological classification of Parkinson's disease severity using multimodal speech biomarkers with a hybrid CNN-Mamba framework
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.SD (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Multi-View Speech Representation Learning for Parkinson's Disease Detection Using Context-guided Cross-modal Attention