Still Not Quite There! Evaluating Large Language Models for Comorbid Mental Health Diagnosis

Amey Hengle; Atharva Kulkarni; Jemima Jacob; Madhumitha Chandrasekaran; Rashmi Gupta; Shantanu Patankar; Sneha D'Silva

arxiv: 2410.03908 · v1 · pith:ICMO3QIAnew · submitted 2024-10-04 · 💻 cs.CL · cs.AI



keywords: models, angst, health, mental, classification, language, posts, benchmark


In this study, we introduce ANGST, a novel, first-of-its-kind benchmark for depression-anxiety comorbidity classification from social media posts. Unlike contemporary datasets that often oversimplify the intricate interplay between different mental health disorders by treating them as isolated conditions, ANGST enables multi-label classification, allowing each post to be simultaneously identified as indicating depression and/or anxiety. Comprising 2876 posts meticulously annotated by expert psychologists and an additional 7667 silver-labeled posts, ANGST provides a more representative sample of online mental health discourse. Moreover, we benchmark ANGST using various state-of-the-art language models, ranging from Mental-BERT to GPT-4. Our results provide significant insights into the capabilities and limitations of these models in complex diagnostic scenarios. While GPT-4 generally outperforms other models, none achieves an F1 score exceeding 72% in multi-class comorbid classification, underscoring the ongoing challenges in applying language models to mental health diagnostics.
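To make the multi-label setup in the abstract concrete, here is a minimal sketch of how a post can carry a depression flag and an anxiety flag at the same time, and how a per-label F1 score could be averaged. The label order, the toy data, the function names, and the choice of macro averaging are all assumptions for illustration; the abstract does not state which F1 variant the authors report.

```python
from typing import Sequence, Tuple

# Hypothetical label order: index 0 = depression, index 1 = anxiety.
LABELS = ("depression", "anxiety")
Label = Tuple[int, int]  # e.g. (1, 1) marks a comorbid post, (0, 0) a control


def f1_for_label(y_true: Sequence[Label], y_pred: Sequence[Label], idx: int) -> float:
    """Binary F1 for one label column; returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t[idx] == 1 and p[idx] == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t[idx] == 0 and p[idx] == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t[idx] == 1 and p[idx] == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[Label], y_pred: Sequence[Label]) -> float:
    """Unweighted mean of the per-label F1 scores (an assumed averaging scheme)."""
    scores = [f1_for_label(y_true, y_pred, i) for i in range(len(LABELS))]
    return sum(scores) / len(scores)


# Toy annotations, not taken from ANGST.
toy_true = [(1, 0), (0, 1), (1, 1), (0, 0)]
toy_pred = [(1, 0), (1, 1), (1, 0), (0, 0)]

if __name__ == "__main__":
    for i, name in enumerate(LABELS):
        print(name, round(f1_for_label(toy_true, toy_pred, i), 3))
    print("macro", round(macro_f1(toy_true, toy_pred), 3))
```

In this toy example the third post is comorbid, but the prediction flags only depression, so it counts as a hit for depression and a miss for anxiety. This is the kind of partial error that a single-label formulation cannot express.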
