pith. sign in

arxiv: 2311.02819 · v1 · pith:QOLWEKHVnew · submitted 2023-11-06 · 💻 cs.CE

An Exploration of Multimodality and Data Augmentation for Dementia Classification

classification 💻 cs.CE
keywords dataaugmentationdementiaperformancetextaudiodatasetsimportance
0
0 comments X
read the original abstract

Dementia is a progressive neurological disorder that profoundly affects the daily lives of older adults, impairing abilities such as verbal communication and cognitive function. Early diagnosis is essential for enhancing both lifespan and quality of life for affected individuals. Despite its importance, diagnosing dementia is complex and often necessitates a multimodal approach incorporating diverse clinical data types. In this study, we fine-tune Wav2vec and Word2vec baseline models using two distinct data types: audio recordings and text transcripts. We experiment with four conditions: original datasets versus datasets purged of short sentences, each with and without data augmentation. Our results indicate that synonym-based text data augmentation generally enhances model performance, underscoring the importance of data volume for achieving generalizable performance. Additionally, models trained on text data frequently excel and can further improve the performance of other modalities when combined. Audio and timestamp data sometimes offer marginal improvements. We provide a qualitative error analysis of the sentence archetypes that tend to be misclassified under each condition, providing insights into the effects of altering data modality and augmentation decisions.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.