SPHERE: An evaluation card for human-AI systems

Dora Zhao, Qianou Ma, Xinran Zhao, Chenglei Si, Chenyang Yang, Ryan Louie, Ehud Reiter, Diyi Yang, Tongshuang Wu · 2025 · DOI 10.18653/v1/2025.findings-acl.70

2 Pith papers cite this work. Polarity classification is still being indexed.


citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Evaluation Cards: An Interpretive Layer for AI Evaluation Reporting

cs.AI · 2026-06-08 · unverdicted · novelty 6.0

EvalCards is a composable reporting schema and monitoring tool for AI evaluations, derived from 52 papers and 10 interviews, and applied to 5,816 models and 101,843 results to surface reporting gaps.

Breakdowns in Conversational AI: Interactional Failures in Emotionally and Ethically Sensitive Contexts

cs.CL · 2026-04-03 · unverdicted · novelty 5.0

Mainstream conversational models show escalating affective misalignments and ethical guidance failures during staged emotional trajectories, organized into a taxonomy of interactional breakdowns.

citing papers explorer

Showing 2 of 2 citing papers.

Evaluation Cards: An Interpretive Layer for AI Evaluation Reporting

cs.AI · 2026-06-08 · unverdicted · none · ref 116

Breakdowns in Conversational AI: Interactional Failures in Emotionally and Ethically Sensitive Contexts

cs.CL · 2026-04-03 · unverdicted · none · ref 47
