Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection
With the global population aging rapidly, Alzheimer's disease (AD) is particularly prevalent among older adults. It has an insidious onset and leads to a gradual, irreversible deterioration in cognitive domains (memory, communication, etc.). Speech-based AD detection opens up the possibility of widespread screening and timely disease intervention. Recent advances in pre-trained models have motivated a shift in AD detection modeling from low-level features to high-level representations. This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features. Based on these features, the paper also proposes a novel task-oriented approach that models the relationship between the participants' descriptions and the cognitive task. Experiments are carried out on the ADReSS dataset in a binary classification setup, and models are evaluated on the unseen test set. Results and comparison with recent literature demonstrate the efficiency and superior performance of the proposed acoustic, linguistic and task-oriented methods. The findings also show the importance of semantic and syntactic information, and the feasibility of automation and generalization with the promising audio-only and task-oriented methods for the AD detection task.
Forward citations
Cited by 3 Pith papers
- Do Multimodal Large Language Models Need Reasoning to Classify Dementia from Speech?
  The DeTAiL adaptor framework extracts internal representations from reasoning MLLMs via a nonlinear adaptor and RL to outperform baselines and text-rationale methods for speech-based dementia classification.
- DDPO-VC: Speaker De-Identification via Diffusion Denoising Policy Optimization
  DDPO-VC applies diffusion denoising policy optimization with dual-teacher rewards to improve speaker de-identification while preserving cognitive utility on dementia speech benchmarks.
- Do Multimodal Large Language Models Need Reasoning to Classify Dementia from Speech?
  DeTAiL uses internal representations from reasoning MLLMs via an adaptor and RL to outperform text-rationale methods and baselines for speech-based dementia classification on two datasets.