Sycoeval-em: Sycophancy evaluation of large language models in simulated clinical encounters for emergency care. arXiv preprint arXiv:2601.16529

11 Preprint Dongshen Peng, Yi Wang, Austin Schoeffler, Carl Preiksaitis, Christian Rose · arXiv 2601.16529

2 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

MedDialBench: Benchmarking LLM Diagnostic Robustness under Parametric Adversarial Patient Behaviors

cs.CL · 2026-04-08 · unverdicted · novelty 6.0

MedDialBench shows LLMs suffer 1.7-3.4x larger diagnostic accuracy drops from patients fabricating symptoms than withholding them, with fabrication driving super-additive interaction effects across models.

SWAY: A Counterfactual Computational Linguistic Approach to Measuring and Mitigating Sycophancy

cs.CL · 2026-04-02 · unverdicted · novelty 6.0

SWAY quantifies sycophancy in LLMs via shifts under linguistic pressure; a counterfactual chain-of-thought mitigation reduces it to near zero while preserving responsiveness to genuine evidence.

citing papers explorer


MedDialBench: Benchmarking LLM Diagnostic Robustness under Parametric Adversarial Patient Behaviors cs.CL · 2026-04-08 · unverdicted · none · ref 11 · internal anchor
SWAY: A Counterfactual Computational Linguistic Approach to Measuring and Mitigating Sycophancy cs.CL · 2026-04-02 · unverdicted · none · ref 16 · internal anchor
