SPHERE: An evaluation card for human-AI systems
2 Pith papers cite this work. Polarity classification is still indexing.
Citation-role summary: background (1)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Roles: background (1)
Polarities: background (1)
citing papers explorer
- Evaluation Cards: An Interpretive Layer for AI Evaluation Reporting
  EvalCards is a composable reporting schema and monitoring tool for AI evaluations, derived from 52 papers and 10 interviews, and applied to 5,816 models and 101,843 results to surface reporting gaps.
- Breakdowns in Conversational AI: Interactional Failures in Emotionally and Ethically Sensitive Contexts
  Mainstream conversational models show escalating affective misalignments and ethical guidance failures during staged emotional trajectories, organized into a taxonomy of interactional breakdowns.