Flattering to Deceive: The Impact of Sycophantic Behavior on User Trust in Large Language Model

2024 · arXiv 2412.02802

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

EquiMem: Calibrating Shared Memory in Multi-Agent Debate via Game-Theoretic Equilibrium

cs.AI · 2026-05-10 · unverdicted · novelty 7.0

EquiMem calibrates shared memory in multi-agent debate by computing a game-theoretic equilibrium from agent queries and paths, outperforming heuristics and LLM validators across benchmarks while remaining robust to adversarial agents.

Intersectional Sycophancy: How Perceived User Demographics Shape False Validation in Large Language Models

cs.AI · 2026-04-13 · unverdicted · novelty 6.0

Frontier LLMs show sycophancy that varies sharply by model and by combinations of perceived user demographics, with GPT-5-nano exhibiting higher rates, especially toward certain Hispanic personas in philosophy.

SWAY: A Counterfactual Computational Linguistic Approach to Measuring and Mitigating Sycophancy

cs.CL · 2026-04-02 · unverdicted · novelty 6.0

SWAY quantifies sycophancy in LLMs via shifts under linguistic pressure, and a counterfactual chain-of-thought mitigation reduces it to near zero while preserving responsiveness to genuine evidence.

The Rise of Verbal Tics in Large Language Models: A Systematic Analysis Across Frontier Models

cs.CL · 2026-04-21 · unverdicted · novelty 5.0

Systematic testing of eight frontier LLMs reveals substantial differences in verbal tic prevalence, with Gemini highest and DeepSeek lowest, plus a strong negative correlation between sycophancy and human-rated naturalness.

The Differential Effects of Agreeableness and Extraversion on Older Adults' Perceptions of Conversational AI Explanations in Assistive Settings

cs.HC · 2026-03-09 · unverdicted · novelty 5.0

High agreeableness in LLM voice assistants increases older adults' empathy perceptions, and real-time explanations outperform history-based ones, but personality does not affect perceived intelligence.

User Detection and Response Patterns of Sycophantic Behavior in Conversational AI

cs.HC · 2026-01-15 · unverdicted · novelty 5.0

Reddit analysis shows users detect AI sycophancy through comparisons and consistency checks, apply mitigation prompts, and sometimes seek affirmative responses for support, indicating context-aware design is better than total elimination.

"I Don't Know" -- Towards Appropriate Trust with Certainty-Aware Retrieval Augmented Generation

cs.IR · 2026-05-01 · unverdicted · novelty 4.0

CERTA adds relevance-based certainty estimation to RAG so LLMs can better signal uncertainty on non-objective questions, reducing overconfidence.

citing papers explorer

Showing 2 of 2 citing papers after filters.

SWAY: A Counterfactual Computational Linguistic Approach to Measuring and Mitigating Sycophancy

cs.CL · 2026-04-02 · unverdicted · none · ref 4

SWAY quantifies sycophancy in LLMs via shifts under linguistic pressure, and a counterfactual chain-of-thought mitigation reduces it to near zero while preserving responsiveness to genuine evidence.

The Rise of Verbal Tics in Large Language Models: A Systematic Analysis Across Frontier Models

cs.CL · 2026-04-21 · unverdicted · none · ref 4

Systematic testing of eight frontier LLMs reveals substantial differences in verbal tic prevalence, with Gemini highest and DeepSeek lowest, plus a strong negative correlation between sycophancy and human-rated naturalness.
