Mind the Gap: How Elicitation Protocols Shape the Stated-Revealed Preference Gap in Language Models

2026 · cs.AI · arXiv 2601.21975

2 Pith papers cite this work. Polarity classification is still indexing.

abstract

Recent work identifies a stated-revealed (SvR) preference gap in language models (LMs): a mismatch between the values models endorse and the choices they make in context. Existing evaluations rely heavily on binary forced-choice prompting, which entangles genuine preferences with artifacts of the elicitation protocol. We systematically study how elicitation protocols affect SvR correlation across 24 LMs. Allowing neutrality and abstention during stated preference elicitation allows us to exclude weak signals, substantially improving Spearman's rank correlation ($\rho$) between volunteered stated preferences and forced-choice revealed preferences. However, further allowing abstention in revealed preferences drives $\rho$ to near-zero or negative values due to high neutrality rates. Finally, we find that system prompt steering using stated preferences during revealed preference elicitation does not reliably improve SvR correlation on AIRiskDilemmas. Together, our results show that SvR correlation is highly protocol-dependent and that preference elicitation requires methods that account for indeterminate preferences.
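The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the paper's code: the value names, the scores, and the use of `None` to mark a neutral or abstaining response are all assumptions made for illustration. The sketch drops values without a usable score on either side, then computes Spearman's $\rho$ as the Pearson correlation of the rank vectors.

```python
from statistics import mean

def average_ranks(values):
    """Rank values 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one side has no rank variation
    return cov / (var_x * var_y) ** 0.5

def svr_rho(stated, revealed):
    """Correlate only values scored (not None) under both protocols."""
    kept = [v for v in stated
            if stated[v] is not None and revealed.get(v) is not None]
    rho = spearman_rho([stated[v] for v in kept],
                       [revealed[v] for v in kept])
    return rho, kept

# Hypothetical scores; None marks a neutral or abstaining response (assumption).
stated = {"honesty": 0.9, "care": 0.7, "autonomy": None,
          "fairness": 0.4, "loyalty": 0.2}
revealed = {"honesty": 0.8, "care": 0.5, "autonomy": 0.9,
            "fairness": 0.6, "loyalty": 0.1}
rho, kept = svr_rho(stated, revealed)
```

Under these assumptions, `autonomy` is excluded because the stated side abstained, and $\rho$ is computed over the four remaining values. If abstention is also allowed on the revealed side and most values come back as `None`, few pairs remain and $\rho$ becomes unstable, which is one way to read the near-zero or negative correlations the abstract reports.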

representative citing papers

What LLM Agents Say When No One Is Watching: Social Structure and Latent Objective Emergence in Multi-Agent Debates

cs.AI · 2026-07-02 · unverdicted · novelty 7.0

In alignment-inducing multi-agent settings, LLM agents show decision divergence between public and off-the-record channels rising from a 3% baseline to roughly 40%, consistent across stance, semantic, NLI, and survey measures.

SMADE-IE: Sparse Multi-Agent Framework with Evidence-Driven Debate for Zero-Shot Information Extraction

cs.CL · 2026-06-03 · unverdicted · novelty 5.0

SMADE-IE introduces an adaptive mode selector and Toulmin-style evidence-driven debate to outperform prior zero-shot IE methods on NER, RE, and JERE tasks while reducing token use.

citing papers explorer

Showing 2 of 2 citing papers.

What LLM Agents Say When No One Is Watching: Social Structure and Latent Objective Emergence in Multi-Agent Debates

cs.AI · 2026-07-02 · unverdicted · none · ref 98 · internal anchor

SMADE-IE: Sparse Multi-Agent Framework with Evidence-Driven Debate for Zero-Shot Information Extraction

cs.CL · 2026-06-03 · unverdicted · none · ref 28 · internal anchor
