CARE evaluates AI therapist utterances on six clinical principles via context, contrastive retrieval, and distilled reasoning, reaching 63.34 F1 on the new FAITH-M benchmark versus 38.56 for its Qwen3 backbone.
In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 5734–5746.
Cited by 1 Pith paper (field: cs.CL; year: 2026; verdict: unverdicted):
Measuring What Matters!! Assessing Therapeutic Principles in Mental-Health Conversation