Pith · machine review for the scientific record

arxiv: 2604.19791 · v2 · submitted 2026-04-02 · 💻 cs.AI

Recognition: no theorem link

Stabilising Generative Models of Attitude Change

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 21:09 UTC · model grok-4.3

classification 💻 cs.AI
keywords attitude change · cognitive dissonance · self-perception · generative modeling · actor simulations · verbal theories · psychological experiments · model stabilization

The pith

Translating verbal theories of attitude change into generative simulations requires an iterative stabilization process that clarifies their operational commitments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a workflow for converting influential but imprecise verbal accounts of attitude change into executable actor-environment simulations. Actors decide actions through predictive pattern completion on natural language strings, guided by theory-specific sequences of reasoning steps that implement cognitive dissonance, self-consistency, or self-perception. When tested on classic experiments, the models reproduce expected behavioral patterns, yet only after repeated adjustments to resolve ambiguities and clashes with modern linguistic assumptions. This stabilization step surfaces situational and representational dependencies left implicit in the original theories. A sympathetic reader would see this as a route to making psychological accounts precise enough to run and compare directly.

Core claim

By rendering the theories of cognitive dissonance, self-consistency, and self-perception as distinct decision logics that populate and process natural-language prefixes through theory-specific sequences of reasoning steps, the resulting simulations generate behavioural patterns consistent with known results from the original empirical literature. Achieving stable reproduction requires resolving the inherent underdetermination of the verbal accounts and the conflicts between modern linguistic priors and historical experimental assumptions. The manual process of iterative model stabilisation surfaces specific operational and socio-ecological dependencies that were largely undocumented in the original verbal accounts.

What carries the argument

Predictive pattern completion on natural language strings, in which actors generate action suffixes from prefixes of memories and observations, directed by theory-specific sequences of reasoning steps.
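The mechanism can be sketched in a few lines. This is a hypothetical illustration, not the Concordia API: the `complete` function is a stand-in for a language model's predictive pattern completion (here it just echoes the tail of its prefix so the sketch runs), and the step lists only paraphrase what theory-specific reasoning sequences might look like.

```python
# Illustrative sketch only: all names here are hypothetical, not Concordia's.

def complete(prefix: str) -> str:
    """Stand-in for a language model completing a natural-language prefix.
    Echoes the tail of the prefix so the sketch is runnable without a model."""
    return f"[completion of: ...{prefix[-40:]}]"

# Theory-specific sequences of reasoning steps. Each step is a question
# appended to the prefix and answered by pattern completion before the
# final action suffix is generated.
DISSONANCE_STEPS = [
    "What belief does my recent behaviour contradict?",
    "How uncomfortable is that contradiction?",
    "Which attitude revision would reduce the discomfort?",
]
SELF_PERCEPTION_STEPS = [
    "What did I just observe myself doing?",
    "What attitude would an observer infer from that behaviour?",
]

def act(memories: list[str], observation: str, steps: list[str]) -> str:
    """Build a prefix from memories and observations, process it through a
    theory's reasoning steps, then complete an action suffix from it."""
    prefix = "\n".join(memories) + "\n" + observation
    for step in steps:
        prefix += "\n" + step + " " + complete(prefix + "\n" + step)
    return complete(prefix + "\nWhat do I do next?")

# Comparing theories means swapping only the step sequence:
memories = ["I told the next participant the task was fun."]
obs = "I was paid $1 for saying so."
action_dissonance = act(memories, obs, DISSONANCE_STEPS)
action_self_perception = act(memories, obs, SELF_PERCEPTION_STEPS)
```

Swapping `DISSONANCE_STEPS` for `SELF_PERCEPTION_STEPS` changes only the reasoning sequence while the completion machinery stays fixed, which is what makes the theories directly comparable in one framework.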

If this is right

  • Competing verbal theories can be placed in the same simulation framework and compared by swapping only their reasoning-step sequences.
  • Implementation forces explicit choices about contextual factors that were left vague in the original accounts.
  • The resulting models can generate fresh predictions for scenarios not covered by the historical experiments.
  • The stabilization workflow itself becomes a repeatable method for turning other verbal theories into runnable systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The need for stabilization may indicate that attitude change effects depend on finer socio-ecological details than the original theories stated.
  • Similar rendering and stabilization could be applied to verbal theories in other areas of psychology to expose comparable hidden commitments.
  • Modern language-model priors may systematically conflict with mid-20th-century experimental assumptions, requiring deliberate correction in any human-behavior simulation.

Load-bearing premise

The core predictions of the verbal theories survive translation into specific sequences of natural-language reasoning steps without introducing artifacts that change their implications.

What would settle it

The core claim would be undercut if the stabilized simulations still failed to produce the attitude shifts reported in the classic forced-compliance or insufficient-justification experiments.

Original abstract

Attitude change - the process by which individuals revise their evaluative stances - has been explained by a set of influential but competing verbal theories. These accounts often function as mechanism sketches: rich in conceptual detail, yet lacking the technical specifications and operational constraints required to run as executable systems. We present a generative actor-based modelling workflow for "rendering" these sketches as runnable actor-environment simulations using the Concordia simulation library. In Concordia, actors operate by predictive pattern completion: an operation on natural language strings that generates a suffix which describes the actor's intended action from a prefix containing memories of their past and observations of the present. We render the theories of cognitive dissonance (Festinger 1957), self-consistency (Aronson 1969), and self-perception (Bem 1972) as distinct decision logics that populate and process the prefix through theory-specific sequences of reasoning steps. We evaluate these implementations across classic psychological experiments. Our implementations generate behavioural patterns consistent with known results from the original empirical literature. However, we find that achieving stable reproduction requires resolving the inherent underdetermination of the verbal accounts and the conflicts between modern linguistic priors and historical experimental assumptions. We document how this manual process of iterative model "stabilisation" surfaces specific operational and socio-ecological dependencies that were largely undocumented in the original verbal accounts. Ultimately, we argue that the manual stabilisation process itself should be regarded as a core part of the methodology, functioning to clarify situational and representational commitments needed to generate characteristic effects.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper presents a generative actor-based modeling workflow using the Concordia library to render verbal theories of attitude change—cognitive dissonance (Festinger 1957), self-consistency (Aronson 1969), and self-perception (Bem 1972)—as executable simulations. Actors operate via predictive pattern completion on natural language prefixes; the theories are implemented as distinct sequences of reasoning steps that populate and process these prefixes. The implementations are evaluated on classic psychological experiments and produce behavioral patterns consistent with empirical results, but only after iterative manual stabilization to resolve underdetermination in the verbal accounts and conflicts between modern linguistic priors and historical assumptions. The authors argue that this stabilization process surfaces undocumented operational and socio-ecological dependencies and should be treated as a core methodological component.

Significance. If the stabilized implementations faithfully capture the core predictions of the original theories without artifacts from the adjustments, the work provides a concrete method for operationalizing mechanism sketches in social psychology. It demonstrates how generative simulations can identify hidden commitments in verbal theories and could support more precise, falsifiable modeling in computational social science. The explicit documentation of stabilization as methodology is a strength that could inform similar efforts to make qualitative theories runnable.

major comments (3)
  1. [Abstract and evaluation section] The claim that 'our implementations generate behavioural patterns consistent with known results' holds only after manual stabilization. The manuscript provides no comparison of outputs from the initial (unstabilized) theory-specific reasoning sequences versus the stabilized versions, nor any quantification of how the iterative adjustments alter behavior. This is load-bearing for the central claim that the verbal theories can be rendered as executable systems on their own terms, as opposed to consistency emerging from post-hoc fixes.
  2. [Methodology description] The paper states that theories are rendered via 'theory-specific sequences of reasoning steps' that populate and process the prefix, yet the underdetermination acknowledged in the abstract implies multiple possible translations. Without explicit listing or pseudocode of these sequences (and how each maps to the original verbal statements of Festinger, Aronson, and Bem), it is not possible to assess whether core predictions are preserved or altered by the chosen operationalization.
  3. [Discussion of stabilization] The argument that the manual stabilization process 'should be regarded as a core part of the methodology' is presented as a positive contribution, but the manuscript does not address how this affects reproducibility, generalizability across experiments, or the risk that stabilized models embed modern priors rather than historical assumptions. This weakens the claim that the workflow directly renders the verbal theories.
minor comments (2)
  1. [Abstract and introduction] The abstract and introduction would benefit from a brief table or figure summarizing the key differences in the three theory-specific reasoning sequences before and after stabilization.
  2. [Framework description] Notation for the prefix/suffix structure in the Concordia framework is introduced but could be formalized with a short equation or diagram for readers unfamiliar with the library.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed comments, which highlight important areas for improving the transparency of our methodology. We agree that additional documentation and comparisons will strengthen the manuscript and address the concerns raised. We respond to each major comment below and will incorporate revisions accordingly.

Point-by-point responses
  1. Referee: [Abstract and evaluation section] The claim that 'our implementations generate behavioural patterns consistent with known results' holds only after manual stabilization. The manuscript provides no comparison of outputs from the initial (unstabilized) theory-specific reasoning sequences versus the stabilized versions, nor any quantification of how the iterative adjustments alter behavior. This is load-bearing for the central claim that the verbal theories can be rendered as executable systems on their own terms, as opposed to consistency emerging from post-hoc fixes.

    Authors: We agree that an explicit comparison of initial versus stabilized outputs, along with quantification of behavioral changes, would strengthen the central claim and clarify the role of stabilization. In the revised manuscript, we will add this analysis to the evaluation section, including representative examples of outputs before and after adjustments and metrics quantifying shifts in behavioral patterns. This will better demonstrate how the stabilization process contributes to reproducing empirical results. revision: yes

  2. Referee: [Methodology description] The paper states that theories are rendered via 'theory-specific sequences of reasoning steps' that populate and process the prefix, yet the underdetermination acknowledged in the abstract implies multiple possible translations. Without explicit listing or pseudocode of these sequences (and how each maps to the original verbal statements of Festinger, Aronson, and Bem), it is not possible to assess whether core predictions are preserved or altered by the chosen operationalization.

    Authors: We acknowledge that greater explicitness is needed to allow assessment of the operationalizations. The revised version will include detailed pseudocode for each theory-specific sequence of reasoning steps in the methodology section, with explicit mappings to the relevant passages in the original verbal statements by Festinger, Aronson, and Bem. This will enable readers to evaluate fidelity to the source theories. revision: yes

  3. Referee: [Discussion of stabilization] The argument that the manual stabilization process 'should be regarded as a core part of the methodology' is presented as a positive contribution, but the manuscript does not address how this affects reproducibility, generalizability across experiments, or the risk that stabilized models embed modern priors rather than historical assumptions. This weakens the claim that the workflow directly renders the verbal theories.

    Authors: We will expand the discussion section to directly address reproducibility, generalizability, and the risk of embedding modern priors. The revision will include discussion of how stabilization steps can be documented for replication purposes, suggestions for testing generalizability across additional experiments, and acknowledgment of potential influences from contemporary linguistic priors with proposed mitigation strategies such as sensitivity checks. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

Full rationale

The paper renders external verbal theories (Festinger 1957, Aronson 1969, Bem 1972) as decision logics in the Concordia library and evaluates generated behaviors against independent classic experiments. The manual stabilization step is explicitly described as resolving underdetermination and surfacing undocumented dependencies, with the process itself positioned as core methodology rather than hidden fitting. No equations, self-citations, or parameter fits are invoked to force consistency by construction; the match to historical results is reported as an outcome after transparent adjustments, keeping the central claim self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on assumptions about faithful translation of verbal theories into simulation steps and the validity of natural language pattern completion for psychological processes.

free parameters (1)
  • theory-specific reasoning sequences
    Manual choices made during stabilization to match experimental outcomes
axioms (1)
  • domain assumption: Verbal psychological theories can be operationalized as sequences of natural language reasoning steps without loss of essential meaning
    Invoked in the rendering process described in the abstract

pith-pipeline@v0.9.0 · 5593 in / 1203 out tokens · 55250 ms · 2026-05-13T21:09:18.696034+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

8 extracted references · 8 canonical work pages · 1 internal anchor

  1. [2] O. Guest and A. E. Martin. How computational modeling can force theory building in psychological science. Perspectives on Psychological Science, 16(4):789–802. https://arxiv.org/abs/2503.19786

  2. [3] J. Z. Leibo, A. S. Vezhnevets, M. Diaz, J. P. Agapiou, W. A. Cunningham, P. Sunehag, J. Haas, R. Koster, E. A. Duéñez-Guzmán, W. S. Isaac, et al. A theory of appropriateness with applications to generative artificial intelligence. arXiv preprint arXiv:2412.19010

  3. [4] J. Z. Leibo, A. S. Vezhnevets, M. Diaz, J. P. Agapiou, W. A. Cunningham, P. Sunehag, L. Cross, R. Koster, S. M. Bileschi, M. Chang, I. Rahwan, S. Osindero, and J. A. Evans. A theory of appropriateness that accounts for norms of rationality. arXiv preprint arXiv:2603.14050

  4. [5] D. Marr and T. Poggio. From understanding computation to understanding neural circuitry. MIT AI Memo 357. doi: 10.1093/oxfordhb/9780199604456.013.0024

  5. [6] G. Serapio-García, M. Safdari, C. Crepy, L. Sun, S. Fitz, M. Abdulhai, A. Faust, and M. Matarić. Personality traits in large language models. arXiv preprint arXiv:2307.00184

  6. [7] T. Underwood, L. K. Nelson, and M. Wilkens. Can language models represent the past without anachronism? arXiv preprint arXiv:2505.00030, 2025

  7. [8] A. S. Vezhnevets, J. P. Agapiou, A. Aharon, R. Ziv, J. Matyas, E. A. Duéñez-Guzmán, W. A. Cunningham, S. Osindero, D. Karmon, and J. Z. Leibo. Generative agent-based modeling with actions grounded in physical, social, or digital space using Concordia. arXiv preprint arXiv:2312.03664

  8. [9] A. S. Vezhnevets, J. Matyas, L. Cross, D. Paglieri, M. Chang, W. A. Cunningham, S. Osindero, W. S. Isaac, and J. Z. Leibo. Multi-actor generative artificial intelligence as a game engine. arXiv preprint arXiv:2507.08892