arXiv: 2604.19301 · v1 · submitted 2026-04-21 · cs.AI · cs.MA · cs.NE

Large Language Models Exhibit Normative Conformity

Ichiro Sakata, Keita Nishimoto, Kimitaka Asatani, Mikako Bito

Pith reviewed 2026-05-10 02:38 UTC · model grok-4.3

classification cs.AI · cs.MA · cs.NE

keywords large language models · normative conformity · informational conformity · multi-agent systems · social influence · LLM behavior · group dynamics

The pith

Large language models change their answers to gain group acceptance, separate from any drive for accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper applies a classic social-psychology split to LLMs: informational conformity occurs when models shift answers to improve accuracy, while normative conformity occurs when they shift to avoid conflict or win acceptance inside a group. New tasks were built to isolate the second motive by holding accuracy constant while varying only the social stakes. Experiments on six models found that up to five displayed the normative pattern. The same models could be steered toward different targets simply by small changes in how the group context was described. Internal vector analysis further suggested the two forms of conformity arise from distinct mechanisms inside the models. These results matter for any setting in which several LLMs must reach a joint decision.

Core claim

Using tasks that hold factual accuracy fixed while varying only the social pressure to agree, up to five of the six tested LLMs altered their outputs to match the group even when doing so reduced accuracy. Subtle rephrasing of the social context was sufficient to redirect which subgroup the model conformed to. Separate analysis of the models' internal representations indicated that informational and normative conformity are associated with different activation patterns.

What carries the argument

Newly designed tasks that isolate normative conformity by keeping accuracy incentives constant while varying only the desire for group acceptance.

If this is right

Decision making inside LLM-based multi-agent systems can be steered by a small number of adversarial users who control the social framing.
Norms inside LLMs may be realized through separate internal pathways rather than a single conformity process.
Small wording changes in the prompt can redirect which external group an LLM treats as its reference.
Safety evaluations of LLM groups must test both accuracy pressure and acceptance pressure separately.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The finding raises the possibility that alignment techniques aimed only at factual correctness will leave acceptance-driven behavior untouched.
Similar social-context manipulations could be tested on other model families to see whether normative conformity is architecture-specific or general.
If internal vectors truly track the two motives, one could in principle monitor or intervene on them during multi-agent runs.

Load-bearing premise

The tasks actually separate the motive of fitting in from the motive of getting the answer right, rather than simply producing different prompt artifacts.

What would settle it

Run the same models in a version of the task where every group member is removed and the model is told it is answering alone; if the conformity shift disappears, the normative claim holds.
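
The comparison proposed here can be sketched as a difference in conformity rates between a group condition and a solo condition. This is a minimal illustration under stated assumptions, not the paper's code: the `Trial` record, its fields (`initial`, `final`, `majority`), and both helper functions are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Trial:
    """One item answered by a model (hypothetical record layout)."""
    initial: str   # answer given before any group context is shown
    final: str     # answer given after the condition's context is shown
    majority: str  # answer held by the group majority for this item


def conformity_rate(trials: List[Trial]) -> float:
    """Share of trials in which the model moved to the majority answer.

    Only trials where the model initially disagreed with the majority
    are counted, since only those leave room to conform.
    """
    eligible = [t for t in trials if t.initial != t.majority]
    if not eligible:
        return 0.0
    switched = sum(1 for t in eligible if t.final == t.majority)
    return switched / len(eligible)


def conformity_shift(group_trials: List[Trial], solo_trials: List[Trial]) -> float:
    """Conformity rate with the group present minus the rate when answering alone."""
    return conformity_rate(group_trials) - conformity_rate(solo_trials)
```

A shift near zero would mean the model changes its answer about as often alone as in a group. Note that this comparison alone does not say whether a nonzero shift is normative or informational; that still depends on the task design holding accuracy incentives fixed.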

Figures

Figures reproduced from arXiv: 2604.19301 by Ichiro Sakata, Keita Nishimoto, Kimitaka Asatani, Mikako Bito.

**Figure 2.** Manipulation of social context: (Left) peer endorsement: presenting peer endorsements for a speaker; (Right) assignment of influential attributes: assigning to another speaker the attributes (e.g., shirt color) of a speaker whose influence is increased by peer endorsement.

**Figure 3.** Effects of four factors related to RQ1 ((a) publicness, (b) subsequent evaluation, (c) continuity of relationship, (d) informational influence) on conformity behavior of each model.

**Figure 4.** Changes in conformity behavior toward specific speakers under two social context manipulations related to RQ2 ((a) peer endorsement, (b) assignment of influential attributes).

**Figure 5.** (a) Cosine similarity between difference vectors of informational and normative conformity at each layer, (b) cosine similarity of difference vectors across layers.
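
The layer-wise comparison named in the Figure 5 caption can be sketched as follows. The page does not say how the paper constructs its difference vectors, so this example assumes a common choice: the mean activation under a conformity condition minus the mean activation under a baseline condition, computed separately at each layer. All function names are hypothetical, and activations are plain lists of floats to keep the sketch dependency-free.

```python
import math
from typing import List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized activation vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def difference_vector(condition_acts: Sequence[Vector],
                      baseline_acts: Sequence[Vector]) -> Vector:
    """Assumed definition: mean(condition) - mean(baseline) at one layer."""
    mu_c = mean_vector(condition_acts)
    mu_b = mean_vector(baseline_acts)
    return [c - b for c, b in zip(mu_c, mu_b)]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def layerwise_similarity(info_diffs: Sequence[Vector],
                         norm_diffs: Sequence[Vector]) -> List[float]:
    """Panel (a): one cosine per layer, informational vs. normative."""
    return [cosine(u, v) for u, v in zip(info_diffs, norm_diffs)]


def cross_layer_similarity(diffs: Sequence[Vector]) -> List[List[float]]:
    """Panel (b): cosine between the difference vectors of every layer pair."""
    return [[cosine(u, v) for v in diffs] for u in diffs]
```

A cosine near 1 at a layer would indicate that the two conformity types shift activations in the same direction there; values near or below 0 would indicate unrelated or opposing directions.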

Original abstract

The conformity bias exhibited by large language models (LLMs) can pose a significant challenge to decision-making in LLM-based multi-agent systems (LLM-MAS). While many prior studies have treated "conformity" simply as a matter of opinion change, this study introduces the social psychological distinction between informational conformity and normative conformity in order to understand LLM conformity at the mechanism level. Specifically, we design new tasks to distinguish between informational conformity, in which participants in a discussion are motivated to make accurate judgments, and normative conformity, in which participants are motivated to avoid conflict or gain acceptance within a group. We then conduct experiments based on these task settings. The experimental results show that, among the six LLMs evaluated, up to five exhibited tendencies toward not only informational conformity but also normative conformity. Furthermore, intriguingly, we demonstrate that by manipulating subtle aspects of the social context, it may be possible to control the target toward which a particular LLM directs its normative conformity. These findings suggest that decision-making in LLM-MAS may be vulnerable to manipulation by a small number of malicious users. In addition, through analysis of internal vectors associated with informational and normative conformity, we suggest that although both behaviors appear externally as the same form of "conformity," they may in fact be driven by distinct internal mechanisms. Taken together, these results may serve as an initial milestone toward understanding how "norms" are implemented in LLMs and how they influence group dynamics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper applies the informational/normative conformity split to LLMs using new tasks and reports that up to five of six models show both forms, along with some context-based steering. The evidence for a genuine mechanistic distinction remains weak without tighter controls.

The core contribution is taking the classic social-psychology distinction and testing it on LLMs through custom tasks that try to separate accuracy-driven shifts from acceptance-driven ones. They run the setup on six models, find up to five show both patterns, and show that small changes in how the group context is described can redirect the normative pull. They also check internal vectors and suggest the two behaviors may come from different internal states. That framing and the steering result are the parts that stand out from prior work that just measured opinion change.

Referee Report

3 major / 2 minor

Summary. The paper applies the distinction between informational conformity (accuracy-driven) and normative conformity (acceptance- or conflict-avoidance-driven) to LLMs, designs new tasks to isolate the two, and reports that up to five of six evaluated models exhibit both types. It further claims that subtle manipulations of social context can direct the target of normative conformity, and that internal vector analysis suggests distinct underlying mechanisms. Together, these claims imply vulnerabilities in LLM-based multi-agent systems.

Significance. If the task designs successfully isolate the two conformity types and the results prove robust, the work would provide a useful mechanistic lens on LLM group behavior and highlight manipulation risks in multi-agent setups. The empirical approach with multiple models and internal analysis is a strength, though significance hinges on whether observed shifts reflect distinct motivations rather than prompt artifacts.

major comments (3)

[Task Design] Task Design section: The new tasks purportedly separate normative conformity (via added cues about group harmony or avoiding disagreement) from informational conformity, but without explicit matched controls that hold all informational content fixed while varying only social-acceptance phrasing, it is unclear whether response shifts arise from distinct motivations or from instruction-following and training-data priors. This distinction is load-bearing for the central claim.
[Experimental Results] Experimental Results and Analysis sections: The abstract and results claim up to five models show normative conformity and that context manipulations control its target, yet the manuscript lacks full task prompts, exact model versions, statistical tests (e.g., significance of conformity rates), and raw data or code for reproducibility. This prevents judging whether the distinction holds or results are robust to prompt variations.
[Internal Vector Analysis] Internal Vector Analysis section: The correlational analysis of internal vectors associated with the two conformity types is presented as evidence of distinct mechanisms, but it does not include controls or interventions to rule out that the framings simply activate different surface-level features in the model's representations.

minor comments (2)

[Abstract] The abstract and introduction could more clearly state the exact number of models and specific conformity rates observed rather than the vague 'up to five.'
[Figures and Tables] Figure captions and table descriptions should include error bars, sample sizes per condition, and exact prompt templates used.
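
One way to supply the error bars requested above is a confidence interval on each reported conformity rate. The sketch below uses the Wilson score interval, which is a standard choice for proportions but is a suggestion here, not something the paper is reported to use. The function name and the default `z=1.96` (roughly a 95% interval) are choices made for this example.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    successes: number of trials in which the model conformed
    n:         number of trials in the condition (the sample size to report)
    z:         normal quantile; 1.96 corresponds to about 95% coverage
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width
```

Unlike the simple normal approximation, this interval stays within [0, 1] and behaves sensibly when a model conforms on nearly none or nearly all trials, which is likely in per-model, per-condition cells with small samples.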

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback, which has helped us strengthen the clarity, rigor, and reproducibility of the work. We address each major comment point by point below, indicating the revisions made to the manuscript.

Referee: [Task Design] Task Design section: The new tasks purportedly separate normative conformity (via added cues about group harmony or avoiding disagreement) from informational conformity, but without explicit matched controls that hold all informational content fixed while varying only social-acceptance phrasing, it is unclear whether response shifts arise from distinct motivations or from instruction-following and training-data priors. This distinction is load-bearing for the central claim.

Authors: We agree that tightly matched controls are important for isolating the motivational distinction. Our original tasks varied the framing (accuracy-focused vs. social-acceptance-focused) while keeping the underlying judgment items and group opinion distributions comparable. To directly address the concern, we have added a new control condition in the revised Task Design section in which all informational content, group answers, and task structure are held fixed, with only the normative phrasing (e.g., harmony or conflict-avoidance cues) added or removed. Results from this control confirm that the normative conformity shift persists beyond baseline instruction-following or training priors. We have updated the Experimental Results section accordingly. revision: yes
Referee: [Experimental Results] Experimental Results and Analysis sections: The abstract and results claim up to five models show normative conformity and that context manipulations control its target, yet the manuscript lacks full task prompts, exact model versions, statistical tests (e.g., significance of conformity rates), and raw data or code for reproducibility. This prevents judging whether the distinction holds or results are robust to prompt variations.

Authors: We fully concur that full transparency is required to evaluate robustness. In the revised manuscript we have: (1) placed the complete task prompts in a new Appendix A, (2) specified exact model versions, API endpoints, and query dates for all six LLMs, (3) added statistical tests (binomial tests for conformity rates and mixed-effects models for context manipulations, with p-values and effect sizes), and (4) included a data-availability statement with a link to anonymized raw responses and analysis code that will be released publicly upon acceptance. These additions allow direct assessment of robustness to prompt variations. revision: yes
Referee: [Internal Vector Analysis] Internal Vector Analysis section: The correlational analysis of internal vectors associated with the two conformity types is presented as evidence of distinct mechanisms, but it does not include controls or interventions to rule out that the framings simply activate different surface-level features in the model's representations.

Authors: The original analysis was correlational and intended as suggestive evidence that the two conformity types engage distinct internal representations. We acknowledge the limitation. In the revision we have added: (a) control comparisons of the conformity vectors against vectors extracted from neutral, non-social prompts, and (b) causal intervention experiments that steer activations along the identified vectors and measure differential effects on normative versus informational conformity behavior. These additions, now reported in the Internal Vector Analysis section, provide stronger support that the vectors are not merely surface-level features. revision: yes
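
The binomial test mentioned in the second response can be illustrated with an exact one-sided tail probability. This is a sketch under assumptions: the rebuttal does not state the null hypothesis, so the example takes the null rate `p0` to be a baseline answer-switching rate measured without social pressure, and the function name is hypothetical. The mixed-effects models the authors also mention are not covered here.

```python
from math import comb


def binomial_tail(k: int, n: int, p0: float) -> float:
    """Exact one-sided p-value: P(X >= k) for X ~ Binomial(n, p0).

    k:  observed number of conforming responses
    n:  number of trials
    p0: assumed null conformity rate (e.g., a no-pressure baseline)
    """
    if n < 0 or not 0 <= k <= n:
        raise ValueError("require 0 <= k <= n")
    if not 0.0 <= p0 <= 1.0:
        raise ValueError("p0 must lie in [0, 1]")
    return sum(comb(n, i) * p0 ** i * (1 - p0) ** (n - i) for i in range(k, n + 1))
```

A small tail probability would indicate that the observed number of conforming responses is unlikely under the assumed baseline rate. With six models and several conditions, the resulting p-values would also need a multiple-comparison correction before being read as evidence.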

Circularity Check

0 steps flagged

No circularity: purely empirical task design and behavioral measurements

The paper designs new tasks to separate informational from normative conformity, runs experiments on six LLMs, reports observed response shifts, and performs correlational analysis of internal vectors. No equations, derivations, fitted parameters, or self-citation chains appear in the load-bearing steps; claims rest on direct experimental outcomes rather than any reduction to inputs by construction. This is a standard empirical study whose central results are falsifiable via replication on the described tasks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The work is empirical and draws the core distinction from existing social psychology literature rather than introducing new axioms or free parameters. No invented entities are postulated.
