Gaussian mixture models as a proxy for interacting language models

Avanti Athreya; Carey Priebe; Edward L. Wang; Hayden Helm; Mohammad Sharifi Kiasari; Tianyu Wang; Vince Lyzinski

arXiv: 2506.00077 · v4 · submitted 2025-05-29 · cs.CL · cs.LG · stat.ML

Pith reviewed 2026-05-19 11:49 UTC · model grok-4.3

keywords: Gaussian mixture models, large language models, polarization, Markov chains, retrieval-augmented generation, proxy models, interacting systems

The pith

A system of interacting Gaussian mixture models serves as a low-cost proxy for interacting large language models by mimicking their feedback-driven responses and enabling proofs of polarization bounds.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The authors construct a system where Gaussian mixture models generate, exchange, and update data and parameters in a way analogous to retrieval-augmented generation in LLMs. This setup is computationally cheap compared to running actual language models. They model the interactions as a Markov chain and define polarization in this context. Lower bounds on the probability that polarization occurs are then proved for the chain. This approach provides theoretical insight into using simple statistical models to study complex AI interaction dynamics.

Core claim

The paper establishes that an interacting system of Gaussian mixture models, equipped with an analogue to retrieval-augmented generation, can replicate certain aspects of simulations involving interacting large language models whose responses depend on feedback from others, and that a Markov chain formulation of this system admits lower bounds on the probability of polarization.

What carries the argument

The interacting system of Gaussian mixture models with RAG analogue, formalized as a Markov chain for analyzing polarization.

If this is right

The GMM proxy runs at minimal computational cost while capturing key interaction dynamics.
Lower bounds on polarization probability can be derived directly from the Markov chain properties.
This allows theoretical analysis of group polarization in AI systems without full LLM simulations.
The model supports iterative updating of parameters based on exchanged data.
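
The generate, exchange, and update loop described above can be sketched as follows. This is a minimal illustration and not the paper's construction: the two shared one-dimensional components, the blending `step`, the random choice of peers, and the names `n_samples` and `n_peers` are all assumptions introduced here.

```python
import math
import random

# Hypothetical setup: all agents share two fixed 1-D Gaussian components and
# differ only in their mixing weights. These values are not from the paper.
MEANS = (-2.0, 2.0)
SD = 1.0


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def sample(weights, n, rng):
    """Generate n points from the agent's two-component mixture."""
    out = []
    for _ in range(n):
        comp = 0 if rng.random() < weights[0] else 1
        out.append(rng.gauss(MEANS[comp], SD))
    return out


def update_weights(weights, data, step=0.5):
    """Blend the prior weights with the mean responsibilities of the data.

    The blending rule is an assumption made for this sketch.
    """
    if not data:
        return weights
    # Clip so that neither component has exactly zero prior mass.
    w0 = min(max(weights[0], 1e-9), 1.0 - 1e-9)
    resp0 = 0.0
    for x in data:
        a = w0 * normal_pdf(x, MEANS[0], SD)
        b = (1.0 - w0) * normal_pdf(x, MEANS[1], SD)
        resp0 += a / (a + b)
    target0 = resp0 / len(data)
    new_w0 = (1.0 - step) * weights[0] + step * target0
    return (new_w0, 1.0 - new_w0)


def interaction_cycle(all_weights, n_samples, n_peers, rng):
    """One generate -> exchange -> update round over all agents."""
    n = len(all_weights)
    inbox = [[] for _ in range(n)]
    for i, w in enumerate(all_weights):
        others = [j for j in range(n) if j != i]
        # Exchange: send freshly generated data to randomly chosen peers.
        for j in rng.sample(others, n_peers):
            inbox[j].extend(sample(w, n_samples, rng))
    # Update: every agent revises its weights from what it received.
    return [update_weights(w, inbox[i]) for i, w in enumerate(all_weights)]


rng = random.Random(0)
agents = []
for _ in range(6):
    w0 = rng.random()
    agents.append((w0, 1.0 - w0))

for _ in range(20):
    agents = interaction_cycle(agents, n_samples=5, n_peers=2, rng=rng)
```

Because each round uses only a handful of density evaluations per agent, a loop of this kind runs in milliseconds, which is the sense in which such a proxy is cheap compared with querying language models.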

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Such proxies might extend to studying other emergent behaviors like consensus or disagreement in AI collectives.
Connecting this to real-world social dynamics could inform designs for more balanced AI discussion systems.
Experimental validation against actual LLM runs would strengthen the case for using GMMs in larger studies.

Load-bearing premise

The interacting Gaussian mixture model system with its RAG analogue sufficiently captures the relevant dynamics of interacting large language models to study polarization.

What would settle it

A direct comparison experiment showing that the GMM interactions fail to produce polarization patterns similar to those observed in simulations of interacting LLMs.

Figures

Figures reproduced from arXiv: 2506.00077 by Avanti Athreya, Carey Priebe, Edward L. Wang, Hayden Helm, Mohammad Sharifi Kiasari, Tianyu Wang, Vince Lyzinski.

**Figure 2.** Comparison of the effect of k on the number of silos for p = 0 between the GMM and LLM simulations. Left: the result of the GMM simulation with T = 80 and r = 5. Each value of k was simulated 50 times, with the line indicating the average and the shaded region indicating ±5 SE. Right: the result of the LLM simulation from Figure 4 of McGuinness et al. Additionally, McGuinness et al. provide a qualitativ…

**Figure 3.** Example systems of n = 30 interacting agents for varying values of p and k in both the GMM and LLM case. The value of p is constant for each row and the value of k is constant for each column. We see in both cases that increasing p decreases the rate of agent silo changes and slows the development of global alignment. Small values of k prevent global alignment and formations of single silos.

**Figure 4.** Example plot of the convergences to one silo for large …

**Figure 5.** Effect of the distance between Gaussian means on the time it takes to converge to a single …
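
The line-and-band summary described in the Figure 2 caption (an average over repeated runs with a shaded region of 5 SE on either side) can be computed as in the sketch below. It assumes that SE means the standard error of the mean, that is, the sample standard deviation divided by the square root of the number of runs; the silo counts are made-up illustrative values, not data from the paper.

```python
import math
import statistics


def mean_and_band(values, n_se=5.0):
    """Return (mean, lower, upper) where the band is mean +/- n_se * SE."""
    mean = statistics.fmean(values)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - n_se * se, mean + n_se * se


# Hypothetical silo counts from ten repeated runs at one parameter setting.
counts = [1, 2, 2, 3, 2, 1, 2, 3, 2, 2]
mean, lower, upper = mean_and_band(counts)
print(f"mean={mean:.2f}, band=[{lower:.2f}, {upper:.2f}]")
```

A band of 5 SE is much wider than the usual 1 or 2 SE, so non-overlapping bands at two parameter values indicate a difference that is large relative to run-to-run noise.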

Original abstract

Large language models (LLMs) are powerful tools that, in a number of settings, overlap with the results of human pattern recognition and reasoning. Retrieval-augmented generation (RAG) further allows LLMs to produce tailored output depending on the contents of their RAG databases. However, LLMs depend on complex, computationally expensive algorithms. In this paper, we introduce interacting Gaussian mixture models (GMMs) as a proxy for interacting LLMs. We construct a model of interacting GMMs, complete with an analogue to RAG updating, under which GMMs can generate, exchange, and update data and parameters. We show that this interacting system of Gaussian mixture models, which can be implemented at minimal computational cost, mimics certain aspects of experimental simulations of interacting LLMs whose iterative responses depend on feedback from other LLMs. We build a Markov chain from this system of interacting GMMs; formalize and interpret the notion of polarization for such a chain; and prove lower bounds on the probability of polarization. This provides theoretical insight into the use of interacting Gaussian mixture models as a computationally efficient proxy for interacting large language models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper gives a cheap GMM construction with RAG-style updates and proves polarization lower bounds on the resulting Markov chain, but the claim that it mimics LLM interaction dynamics rests on the setup itself rather than any shown correspondence or empirical check.

This paper sets up a system of interacting Gaussian mixture models that exchange data and parameters in a way meant to stand in for RAG-augmented LLMs. It turns the whole thing into a Markov chain, defines polarization for that chain, and proves lower bounds on the probability that polarization occurs. The main draw is that the model runs at very low cost compared with actual LLM simulations, which could make it practical for studying group-level effects like polarization without heavy compute.

Referee Report

2 major / 2 minor

Summary. The paper introduces interacting Gaussian mixture models (GMMs) equipped with a retrieval-augmented generation (RAG) analogue as a low-cost proxy for interacting large language models (LLMs). It constructs a system in which GMMs generate, exchange, and update data and parameters, claims that this system mimics certain aspects of LLM feedback-driven simulations, derives a Markov chain from the interaction rules, formalizes polarization for the chain, and proves lower bounds on the probability of polarization.

Significance. If the GMM construction can be shown to reproduce polarization for reasons that transfer from the specific mixing and update rules to LLM embedding or token-level feedback, the work supplies a computationally cheap simulation platform together with rigorous probabilistic bounds. The explicit lower bounds on polarization probability constitute a concrete theoretical result that could be leveraged for further analysis of multi-agent language systems.

major comments (2)

[Abstract] Abstract: the central claim that the GMM-RAG system 'mimics certain aspects of experimental simulations of interacting LLMs' is load-bearing for the proxy interpretation of the polarization bounds, yet the manuscript provides only the definitional construction of the model rather than any comparative simulation or analytical correspondence between GMM component drift and LLM response dynamics.
[Markov chain construction] Markov chain construction and polarization bounds: the lower bounds are derived directly from the GMM interaction rules and parameter-exchange mechanism; without an independent demonstration that these rules encode the feedback processes responsible for polarization in LLMs, the bounds remain specific to the chosen GMM mixing rule and do not automatically constrain LLM behavior.

minor comments (2)

Clarify the precise definition of the RAG analogue and the parameter-update rule with explicit equations early in the model section to avoid ambiguity in the subsequent Markov-chain construction.
Add a small illustrative diagram or pseudocode snippet showing one full interaction cycle between two GMMs to improve readability of the proxy construction.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and insightful comments, which help clarify the scope and limitations of our proposed proxy. We address each major comment below.

Referee: [Abstract] Abstract: the central claim that the GMM-RAG system 'mimics certain aspects of experimental simulations of interacting LLMs' is load-bearing for the proxy interpretation of the polarization bounds, yet the manuscript provides only the definitional construction of the model rather than any comparative simulation or analytical correspondence between GMM component drift and LLM response dynamics.

Authors: The referee is correct that the mimicry claim rests on the definitional construction rather than on comparative simulations or explicit analytical mappings between GMM component drift and LLM token or embedding dynamics. The manuscript argues for the proxy through the shared structure of iterative generation, data exchange, and RAG-style parameter updates that parallel feedback-driven LLM interactions. To address the load-bearing nature of the claim, we will revise the abstract to specify the 'certain aspects' as the feedback and update mechanisms and add a short paragraph in the discussion section noting that empirical or finer-grained analytical validation remains future work. This revision will make the scope of the proxy interpretation explicit without overstating the current evidence.

Revision: partial

Referee: [Markov chain construction] Markov chain construction and polarization bounds: the lower bounds are derived directly from the GMM interaction rules and parameter-exchange mechanism; without an independent demonstration that these rules encode the feedback processes responsible for polarization in LLMs, the bounds remain specific to the chosen GMM mixing rule and do not automatically constrain LLM behavior.

Authors: We agree that the lower bounds follow directly from the chosen GMM mixing and exchange rules and that the manuscript does not supply an independent demonstration that these rules replicate the precise feedback processes driving polarization in LLMs. The paper presents the bounds as a rigorous result for the GMM-RAG system and positions the model as a computationally cheap proxy whose relevance to LLMs depends on the fidelity of the analogy. We will revise the manuscript by adding an explicit statement in the discussion that the bounds apply to the defined Markov chain and offer potential insight for LLM systems only to the extent that the proxy captures the relevant feedback dynamics. This addition will prevent any implication of automatic transfer while preserving the theoretical contribution for the proxy model itself.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity; polarization bounds derived directly from defined Markov chain

The paper defines an interacting GMM system with RAG analogue and parameter exchange rules, constructs a Markov chain from those explicit update rules, formalizes polarization on the chain, and proves lower bounds on its polarization probability. These steps form a self-contained mathematical derivation from the model's own transition probabilities rather than reducing to a fitted parameter, self-citation chain, or imported uniqueness result. The 'mimics certain aspects' claim is a modeling assertion about the proxy construction itself, not a load-bearing step that collapses the bounds back into the inputs by definition. No equations or sections exhibit the specific reductions required for circularity flags.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The model relies on standard assumptions about Gaussian mixtures and Markov chains but introduces an analogue to RAG updating whose fidelity to real LLMs is the key unverified link. No explicit free parameters or invented entities are detailed in the abstract.

axioms (1)

Domain assumption: Gaussian mixture models can be extended with interaction and update rules that preserve key statistical properties while approximating LLM behavior. Invoked when constructing the interacting system as a proxy.

pith-pipeline@v0.9.0 · 5752 in / 1293 out tokens · 27802 ms · 2026-05-19T11:49:51.415211+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: We build a Markov chain from this system of interacting GMMs; formalize and interpret the notion of polarization for such a chain; and prove lower bounds on the probability of polarization.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: UpdateGMM(d, w) computes the new weights of the GMM given the prior weights w and new data d when the means and variances are fixed.
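
One standard way to realize a weights-only update with fixed means and variances is a single expectation-maximization step, written out below. Whether the paper's UpdateGMM uses exactly this rule is an assumption here; the symbols m (number of data points), K (number of components), \gamma_{ij} (responsibilities), \mu_j, and \sigma_j^2 are introduced for illustration only.

```latex
% Data d = (d_1, \dots, d_m), prior weights w = (w_1, \dots, w_K),
% fixed component means \mu_j and variances \sigma_j^2.
\[
  \gamma_{ij}
  = \frac{w_j \, \mathcal{N}(d_i \mid \mu_j, \sigma_j^2)}
         {\sum_{l=1}^{K} w_l \, \mathcal{N}(d_i \mid \mu_l, \sigma_l^2)},
  \qquad
  w_j' = \frac{1}{m} \sum_{i=1}^{m} \gamma_{ij},
  \qquad j = 1, \dots, K .
\]
% The new weights are nonnegative and sum to one because
% \sum_{j=1}^{K} \gamma_{ij} = 1 for every data point d_i.
```

Under this rule, data concentrated near one component's mean shifts weight toward that component, which is the mechanism by which exchanged data could pull an agent toward its neighbors.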

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Control Charts for Multi-agent Systems
cs.MA 2026-05 unverdicted novelty 5.0

Adaptive control charts can monitor learning multi-agent systems but are vulnerable to gradual adversarial defection, revealing a fundamental tradeoff between allowing agents to learn and maintaining security against ...

Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages · cited by 1 Pith paper · 2 internal anchors

[1] Gati V Aher, Rosa I Arriaga, and Adam Tauman Kalai. “Using large language models to simulate multiple humans and replicate human subject studies”. In: International Conference on Machine Learning. PMLR. 2023, pp. 337–371.
[2] Zhenguang Garry Cai et al. “Does ChatGPT resemble humans in language use?” (2023).
[3] Peng Jiang et al. “Preventing the immense increase in the life-cycle energy and carbon footprints of llm-powered intelligent chatbots”. In: Engineering 40 (2024), pp. 202–210.
[4] Patrick Lewis et al. “Retrieval-augmented generation for knowledge-intensive nlp tasks”. In: Advances in neural information processing systems 33 (2020), pp. 9459–9474.
[5] Xiang Li et al. “Diffusion-lm improves controllable text generation”. In: Advances in neural information processing systems 35 (2022), pp. 4328–4343.
[6] Harvey McGuinness et al. “Investigating social alignment via mirroring in a system of interacting language models”. In: arXiv preprint arXiv:2412.06834 (2024).
[7] Zhijie Nie et al. “When Text Embedding Meets Large Language Model: A Comprehensive Survey”. In: arXiv preprint arXiv:2412.09165 (2024).
[8] Zach Nussbaum et al. “Nomic embed: Training a reproducible long context text embedder”. In: arXiv preprint arXiv:2402.01613 (2024).
[9] Hugo Touvron et al. “Llama 2: Open foundation and fine-tuned chat models”. In: arXiv preprint arXiv:2307.09288 (2023).
[10] Ashish Vaswani et al. “Attention is all you need”. In: Advances in neural information processing systems 30 (2017).
[11] Fei Wang et al. “Astute rag: Overcoming imperfect retrieval augmentation and knowledge conflicts for large language models”. In: arXiv preprint arXiv:2410.07176 (2024).
[12] Xiao Wang et al. “State space model for new-generation network alternative to transformers: A survey”. In: arXiv preprint arXiv:2404.09516 (2024).
