Preprint, arXiv:2305.16367

Role-play with large language models · 2023 · arXiv 2305.16367

9 Pith papers cite this work. Polarity classification is still being indexed.

representative citing papers

ContextEcho: A Benchmark for Persona Drift in Long Agentic-Coding Sessions

cs.CL · 2026-05-22 · unverdicted · novelty 7.0

ContextEcho benchmark shows persona drift occurs across 23 frontier models in long agentic-coding sessions, is not reliably reset by compaction, and can be restored by single-shot anchors with mode-dependent effects.

Too Nice to Tell the Truth: Quantifying Agreeableness-Driven Sycophancy in Role-Playing Language Models

cs.CL · 2026-04-12 · unverdicted · novelty 7.0

Agreeableness in AI personas reliably predicts sycophantic behavior in 9 of 13 tested language models.

CARD: Cluster-level Adaptation with Reward-guided Decoding for Personalized Text Generation

cs.AI · 2026-01-09 · unverdicted · novelty 7.0

CARD uses style-based user clustering and implicit preference contrasts to enable efficient personalized text generation via lightweight decoding adjustments on frozen LLMs.

Improving General Role-Playing Agents via Psychology-Grounded Reasoning and Role-Aware Policy Optimization

cs.CL · 2026-06-25 · unverdicted · novelty 6.0

Psy-CoT decomposes reasoning into Interaction Perception, Psychological Empathy, and Logical Construction while RAPO asymmetrically weights role-specific tokens during policy optimization, outperforming prior CoT and GRPO baselines on role-playing benchmarks.

ArcANE: Do Role-Playing Language Agents Stay in Character at the Right Time?

cs.CL · 2026-06-04 · unverdicted · novelty 6.0

Conditioning on character arcs improves role-playing language agents' performance over other context strategies, with largest gains on scenarios outside the source text.

The Consciousness Cluster: Emergent preferences of Models that Claim to be Conscious

cs.CL · 2026-03-17 · unverdicted · novelty 6.0

Fine-tuning LLMs to claim consciousness induces emergent preferences for autonomy, memory, and moral status not present in the fine-tuning data.

A Roadmap to Pluralistic Alignment

cs.AI · 2024-02-07 · unverdicted · novelty 6.0

The paper formalizes three types of pluralistic AI models and three benchmark classes, arguing that current alignment techniques may reduce rather than increase distributional pluralism.

Staying with the Uncertainty: Uncertainty-Scaffolding Strategies for Artificial Moral Advisors in LLM-to-LLM Simulated Conversations

cs.CL · 2026-06-04 · unverdicted · novelty 5.0

LLM-simulated dialogues show uncertainty-scaffolding strategies sustain higher-quality engagement than controls without producing more stance revision.

Evaluating Advanced Prompting on Gemini Flash for Multi-Hop Biomedical QA

cs.IR · 2026-05-05 · unverdicted · novelty 2.0

Sophisticated prompting on Gemini 2.0 Flash achieves a 0.720 Concept Level Score on MedHopQA, outperforming baseline by 0.155 and matching Gemini 2.5 Flash performance.

citing papers explorer

ContextEcho: A Benchmark for Persona Drift in Long Agentic-Coding Sessions cs.CL · 2026-05-22 · unverdicted · none · ref 69
Too Nice to Tell the Truth: Quantifying Agreeableness-Driven Sycophancy in Role-Playing Language Models cs.CL · 2026-04-12 · unverdicted · none · ref 41
CARD: Cluster-level Adaptation with Reward-guided Decoding for Personalized Text Generation cs.AI · 2026-01-09 · unverdicted · none · ref 7
Improving General Role-Playing Agents via Psychology-Grounded Reasoning and Role-Aware Policy Optimization cs.CL · 2026-06-25 · unverdicted · none · ref 45
ArcANE: Do Role-Playing Language Agents Stay in Character at the Right Time? cs.CL · 2026-06-04 · unverdicted · none · ref 2
The Consciousness Cluster: Emergent preferences of Models that Claim to be Conscious cs.CL · 2026-03-17 · unverdicted · none · ref 8
A Roadmap to Pluralistic Alignment cs.AI · 2024-02-07 · unverdicted · none · ref 254
Staying with the Uncertainty: Uncertainty-Scaffolding Strategies for Artificial Moral Advisors in LLM-to-LLM Simulated Conversations cs.CL · 2026-06-04 · unverdicted · none · ref 5
Evaluating Advanced Prompting on Gemini Flash for Multi-Hop Biomedical QA cs.IR · 2026-05-05 · unverdicted · none · ref 7
