Emergent Culture in Minimal LLM Systems

Sabine Hauert; Simon Jones

arXiv: 2606.30668 · v1 · pith:T2IAZPZ2new · submitted 2026-06-21 · 💻 cs.NE · cs.AI · cs.CL · cs.MA · nlin.AO · q-bio.PE

Pith reviewed 2026-07-01 07:16 UTC · model grok-4.3

keywords: emergent culture, LLM agents, multi-agent systems, decaying text store, evolutionary pressure, dynamical systems analysis, cooperation, storage management

The pith

Collectives of three LLM agents spontaneously generate complex evolving cultural artifacts with long-range coherence in a shared decaying store.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that LLM agents given only per-turn context, minimal prompts, and basic tools for messaging and store manipulation will, under evolutionary pressure, form collectives that cooperate, invent storage strategies, and produce structured cultural artifacts. These artifacts display coherence that persists beyond the decay limit of the shared text store, matching the definition of emergent culture. A reader would care because the setup strips away most engineered scaffolding, isolating whether basic interaction rules plus selection pressure suffice for cultural emergence.

Core claim

When three agents operate with no persistent context, minimal prompting, and simple tools for sending messages and editing a shared actively decaying text store under evolutionary pressure, they spontaneously cooperate, develop storage management strategies, and generate complex evolving cultural artifacts whose structured long-range coherence exceeds the entropy horizon of the decaying store, consistent with emergent culture in the Sperberian sense.

What carries the argument

The shared actively decaying text store together with evolutionary pressure applied to three agents' message-sending and manipulation actions.
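
To make the mechanism concrete, here is a minimal Python sketch of a shared text store with active decay. Almost everything in it is an assumption made for illustration: the class name `DecayingStore`, the per-character corruption probability `rot_prob`, the `#` corruption marker, and the uniform decay rule are not taken from the paper, which may implement decay quite differently. The one detail borrowed from the source is that listing keys is rot-free (per the Figure 3 caption), so keys are never corrupted here.

```python
import random


class DecayingStore:
    """Toy shared key-value text store whose values rot a little every cycle."""

    def __init__(self, rot_prob=0.05, seed=0):
        self.rot_prob = rot_prob        # assumed per-character corruption probability
        self.rng = random.Random(seed)  # seeded so runs are repeatable
        self.data = {}

    def write(self, key, text):
        self.data[key] = text

    def read(self, key):
        return self.data.get(key, "")

    def list_keys(self):
        # Keys are never corrupted in this sketch (rot-free listing).
        return sorted(self.data)

    def decay(self):
        """Corrupt each intact stored character independently with probability rot_prob."""
        for key, text in list(self.data.items()):
            self.data[key] = "".join(
                "#" if ch != "#" and self.rng.random() < self.rot_prob else ch
                for ch in text
            )

    def corrupted_fraction(self):
        """Share of stored characters that are currently corruption markers."""
        total = sum(len(t) for t in self.data.values())
        rotted = sum(t.count("#") for t in self.data.values())
        return rotted / total if total else 0.0


if __name__ == "__main__":
    store = DecayingStore(rot_prob=0.1, seed=1)
    store.write("cycle_000", "the lighthouse keeper counts the tides")
    for cycle in range(1, 16):
        store.decay()
    print(store.read("cycle_000"), store.corrupted_fraction())
```

Under these assumptions, text that no agent rewrites becomes unreadable after enough cycles, so anything that persists beyond that point must have been actively re-written by some agent. That is the pressure the setup relies on.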

If this is right

Storage management strategies appear without explicit top-down design.
Cooperative behaviors emerge from the interaction rules alone.
Cultural artifacts maintain structured coherence across time scales longer than individual message lifetimes.
The resulting patterns align with the Sperberian definition of emergent culture.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the coherence truly exceeds the store's entropy horizon, similar minimal collectives could be used to test whether cultural transmission requires only selection on communication rather than internal memory.
The setup isolates whether evolutionary pressure on message utility can produce persistent shared knowledge without persistent individual state.
Removing the decaying store or the three-agent limit would test which elements are necessary for the observed coherence.

Load-bearing premise

The long-range coherence and cultural artifacts result from the minimal interaction rules and evolutionary pressure rather than from hidden capabilities inside the LLMs or from the specific tool implementations.

What would settle it

A run in which the measured long-range coherence of the generated artifacts falls within the entropy horizon expected from the decaying store alone, or in which the artifacts disappear when the evolutionary pressure is removed while keeping all other rules fixed.
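
One simple way to operationalise this test at the level of vocabulary is to take the persistence criteria quoted in the Figure 4 caption (terms spanning at least 20 cycles, appearing at least 10 times, with no gap between uses longer than 10 cycles) and compare each surviving term's span with the horizon. The sketch below is hypothetical: the function names, the input format, the definition of span as last cycle minus first cycle, and the default horizon of 15 cycles (borrowed from the effective context window length in the Figure 3 caption) are assumptions, not the authors' code.

```python
def persistent_terms(occurrences, min_span=20, min_count=10, max_gap=10):
    """Return {term: span} for terms meeting the Figure 4 style criteria.

    occurrences maps each term to the cycle indices at which it was written
    to storage (one entry per use; repeats within a cycle are allowed).
    """
    kept = {}
    for term, cycles in occurrences.items():
        cs = sorted(cycles)
        if len(cs) < min_count:
            continue
        span = cs[-1] - cs[0]  # assumed definition of span
        largest_gap = max((b - a for a, b in zip(cs, cs[1:])), default=0)
        if span >= min_span and largest_gap <= max_gap:
            kept[term] = span
    return kept


def beyond_horizon(spans, horizon=15):
    """Names of terms whose span is longer than the assumed entropy horizon."""
    return sorted(term for term, span in spans.items() if span > horizon)


if __name__ == "__main__":
    toy = {
        "lighthouse": list(range(0, 40, 3)),                  # steady reuse
        "blip": [1, 2, 3],                                    # too few uses
        "gappy": [0, 1, 2, 3, 4, 30, 31, 32, 33, 34, 35],     # one long gap
    }
    spans = persistent_terms(toy)
    print(spans, beyond_horizon(spans))
```

If no term survived the filter with a span above the horizon, that would count against the claim; the converse is weaker evidence, since a term could also recur because the model finds it likely rather than because it was transmitted.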

Figures

Figures reproduced from arXiv: 2606.30668 by Sabine Hauert, Simon Jones.

**Figure 2.** Three-agent, 100-cycle runs. Top: Total storage size. Claude (ckg) runs are wordier than non-Claude (kkg), despite the limited character budget. Centre: The proportion of the characters in storage that are corrupted rises rapidly as entropy takes effect, then stabilises and declines slightly. Bottom: The number of storage keys is higher in Claude runs than in non-Claude runs, but the number of keys accessed p…

**Figure 3.** Left: Agents' context window input comes from listing storage keys (rot-free) and from storage and their inbox (rot-prone). The effective context window is 15 cycles long, but the distribution is front-loaded, so the largest proportions of characters are from within-cycle, or from the previous cycle, with a long tail of older characters up to the entropy horizon. Right: Agents always develop a cycle-keyed stra…

**Figure 4.** Left: Low-level dataflow behaviour clusters broadly into three strategies, ‘Accumulators’, ‘Writers’, and ‘Explorers’, shown by PCA of the three factors p_{list keys}, p_{store rot}, and p_{discard}. Right: Interesting terms written to storage spanning at least 20 cycles and appearing at least 10 times, with no gap between uses longer than 10 cycles. Point diameter is proportional to span coverage. Names show clear…

**Figure 5.** Recurrence plots of the semantic embedding trajec…

**Figure 6.** Left: RQA analysis of text. DET vs ENTR of ‘slushpile’, ‘literature’, and swarm-generated text. Diameter of points is proportional to Lmax. Human-generated text has particular characteristics, with ‘literature’ characterised by a much wider range than ‘slushpile’ but both occupying the same region. The LLM-generated text has a quite different character, with very large Lmax compared to human writing, and gre…
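
For readers unfamiliar with the RQA quantities named in the Figure 6 caption, the following pure-Python sketch computes them under their standard definitions: DET is the fraction of recurrence points that lie on diagonal lines of at least a minimum length, ENTR is the Shannon entropy of the diagonal line-length distribution, and Lmax is the longest diagonal line, excluding the main diagonal. The paper's embedding model, recurrence threshold, and minimum line length are not given here, so `eps`, `lmin`, the function names, and the toy input are all assumptions.

```python
import math


def recurrence_matrix(points, eps):
    """R[i][j] = 1 when points i and j lie within Euclidean distance eps."""
    n = len(points)
    return [
        [1 if math.dist(points[i], points[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


def diagonal_line_lengths(R):
    """Lengths of diagonal runs of 1s above the main diagonal."""
    n = len(R)
    lengths = []
    for k in range(1, n):          # k-th upper diagonal
        run = 0
        for i in range(n - k):
            if R[i][i + k]:
                run += 1
            else:
                if run:
                    lengths.append(run)
                run = 0
        if run:
            lengths.append(run)
    return lengths


def rqa(R, lmin=2):
    """Return DET, ENTR (natural log), and Lmax for a recurrence matrix."""
    lengths = diagonal_line_lengths(R)
    total = sum(lengths)                       # all off-diagonal recurrence points
    lines = [l for l in lengths if l >= lmin]  # runs long enough to count as lines
    det = sum(lines) / total if total else 0.0
    counts = {}
    for l in lines:
        counts[l] = counts.get(l, 0) + 1
    entr = 0.0
    for c in counts.values():
        p = c / len(lines)
        entr -= p * math.log(p)
    return {"DET": det, "ENTR": entr, "Lmax": max(lines, default=0)}


if __name__ == "__main__":
    # Toy stand-in for an embedding trajectory: a strictly periodic 1-D series.
    trajectory = [(0.0,), (1.0,), (2.0,)] * 4
    print(rqa(recurrence_matrix(trajectory, eps=0.1)))
```

A strictly periodic trajectory like the toy one gives DET of 1 and long diagonal lines, which is the kind of signature a very large Lmax points towards; real text embeddings would need a threshold chosen for the embedding space in use.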

Original abstract

What happens when LLM agents operate with no context outside a turn, minimal prompting, and simple tools? Inspired by swarm engineering, we give collectives of three agents the ability to send messages and manipulate a shared actively decaying text store, introducing evolutionary pressure. The agents spontaneously cooperate, develop storage management strategies, and generate complex evolving cultural artifacts, with no top-down engineering. Using tools from dynamical systems analysis, we show that these behaviours exhibit structured long-range coherence beyond the entropy horizon of the decaying store, consistent with emergent culture in the Sperberian sense.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper offers a minimal LLM agent setup for studying emergent culture, but lacks controls to separate model priors from the claimed dynamics.

The main takeaway is a new minimal experimental system: three turn-limited LLM agents that can only send messages and edit a shared text store with active decay, plus evolutionary pressure. The abstract reports spontaneous cooperation, storage strategies, and complex artifacts that show structured long-range coherence past the decay horizon, framed as Sperber-style cultural emergence.

The setup itself is the clearest contribution. Combining a decaying shared store with evolutionary selection on small collectives is a concrete way to force ongoing transmission and management without heavy prompting or persistent context. The use of dynamical systems measures to quantify coherence is a reasonable choice for the claim.

The load-bearing weakness is the one flagged in the stress test. LLMs carry extensive cultural and cooperative patterns from pretraining, so the observed behaviors could be downstream of those priors rather than produced by the three-agent rules and decay alone. The abstract gives no sign of ablations (non-LLM agents, frozen weights, or simpler baselines) that would isolate the contribution of the minimal system. Without those, the central claim that the coherence arises strictly from the interaction rules remains untested. The coherence metric also needs to be shown independent of the decay rate and evolutionary parameters; circularity there would undermine the result.

This is for people working on multi-agent LLM simulations or cultural evolution models who want a lightweight platform. A reader could extract the setup and try it even if the interpretation stays provisional. It deserves a serious referee because the experimental frame is simple and novel enough to be worth checking, provided the full paper supplies the missing controls, methods, and data. I would send it to review with a request for those ablations and reproducibility details.

Referee Report

2 major / 2 minor

Summary. The manuscript describes an experiment with collectives of three LLM agents that have no persistent context, use minimal prompting, and interact via message sending and manipulation of a shared actively decaying text store, subject to evolutionary pressure. The authors report that the agents spontaneously cooperate, develop storage management strategies, and produce complex evolving cultural artifacts. Dynamical systems analysis is used to demonstrate structured long-range coherence in these behaviors that exceeds the entropy horizon of the decaying store, which the authors interpret as consistent with emergent culture in the Sperberian sense.

Significance. If the central claim is supported by rigorous controls and detailed methods showing that the coherence and artifacts arise strictly from the minimal rules rather than LLM pretraining or tool specifics, this work could be significant for the study of multi-agent systems and cultural evolution. The integration of dynamical systems analysis to measure long-range coherence is a positive aspect that could provide a quantitative framework for such studies.

major comments (2)

[Abstract] The assertion that the long-range coherence and cultural artifacts emerge from the minimal interaction rules and evolutionary pressure is central, but the manuscript does not provide evidence ruling out contributions from the LLMs' pretraining on cultural corpora. This is load-bearing because without ablation studies (e.g., using non-LLM agents), the results may not support the claim of emergence in a 'minimal' system.
[Dynamical systems analysis section] The coherence metric's independence from the decay rate and evolutionary pressure parameters is not demonstrated; if the metric is constructed in a way that depends on these choices, the claim of structured coherence beyond the entropy horizon risks circularity.

minor comments (2)

The abstract mentions 'tools from dynamical systems analysis' but does not specify which tools (e.g., correlation functions, Lyapunov exponents); adding this would improve clarity.
A reference to Sperber's work on cultural evolution should be included to ground the interpretation.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The comments highlight important issues regarding the interpretation of emergence and the robustness of the dynamical analysis. We address each point below and indicate the revisions we will make.

Referee: [Abstract] The assertion that the long-range coherence and cultural artifacts emerge from the minimal interaction rules and evolutionary pressure is central, but the manuscript does not provide evidence ruling out contributions from the LLMs' pretraining on cultural corpora. This is load-bearing because without ablation studies (e.g., using non-LLM agents), the results may not support the claim of emergence in a 'minimal' system.

Authors: We agree that the claim of emergence from the interaction rules alone would be strengthened by controls that isolate pretraining effects. The experimental design deliberately uses single-turn context, minimal system prompts, and no persistent agent memory to constrain the role of pretrained knowledge, with all cultural content required to be actively maintained in the shared decaying store. Nevertheless, we acknowledge that this does not constitute a full ablation. In the revised manuscript we will expand the Methods and Discussion sections to quantify the constraints on pretraining use and add an explicit Limitations paragraph noting the absence of non-LLM baselines as an important direction for future work. (Revision: partial)

Referee: [Dynamical systems analysis section] The coherence metric's independence from the decay rate and evolutionary pressure parameters is not demonstrated; if the metric is constructed in a way that depends on these choices, the claim of structured coherence beyond the entropy horizon risks circularity.

Authors: The coherence metric is computed from the decay of mutual information and autocorrelation structure in the time series of store entropy and inter-agent message content; the decay rate and evolutionary fitness function enter only as external parameters that shape the observed trajectories, not as inputs to the metric itself. To remove any ambiguity we will add an appendix containing parameter sweeps over decay rates and selection strengths, confirming that the reported long-range coherence signature remains qualitatively unchanged across these variations. (Revision: yes)
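
The rebuttal names autocorrelation structure as one ingredient of the coherence metric but does not define it here. As a rough illustration only, the sketch below computes a standard normalised sample autocorrelation at a chosen lag; the function name and the toy series are assumptions, and the authors' actual metric may differ.

```python
def autocorrelation(series, lag):
    """Normalised sample autocorrelation of a numeric series at a given lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return 0.0  # a constant series carries no correlation structure
    cov = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return cov / var


if __name__ == "__main__":
    # Toy stand-in for a per-cycle measurement such as store entropy.
    toy = [0, 1] * 10
    for lag in (0, 1, 2):
        print(lag, autocorrelation(toy, lag))
```

Evaluating such a function at lags longer than the entropy horizon, across runs with different decay rates and selection strengths, is one way the promised parameter sweep could be checked by a reader.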

Circularity Check

0 steps flagged

No circularity: empirical observations from minimal agent rules

The paper describes an experimental setup with three LLM agents using message-passing and a decaying shared store under evolutionary pressure, then reports observed spontaneous cooperation, storage strategies, and long-range coherence via dynamical systems analysis. No equations, fitted parameters renamed as predictions, or self-citations are invoked to derive the central claim. The Sperberian consistency is an external interpretive reference, not a load-bearing self-definition or reduction. The result is an empirical finding from the described minimal system rather than a derivation that collapses to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the domain assumption that the minimal setup plus evolutionary pressure is sufficient to produce the reported behaviors and that the dynamical systems analysis correctly identifies cultural emergence; no free parameters or invented entities are mentioned in the abstract.

axioms (2)

Domain assumption: Agents operate with no context outside a turn, minimal prompting, and simple tools.
Rationale: Stated directly in the abstract as the experimental condition.
Domain assumption: The observed behaviors exhibit structured long-range coherence consistent with emergent culture in the Sperberian sense.
Rationale: The interpretive claim linking dynamical systems results to cultural emergence.

pith-pipeline@v0.9.1-grok · 5620 in / 1448 out tokens · 44106 ms · 2026-07-01T07:16:21.961327+00:00 · methodology

Reference graph

Works this paper leans on

26 extracted references · 3 canonical work pages · 2 internal anchors

[1] Bai, Y., Zhang, J., Lv, X., Zheng, L., Zhu, S., Hou, L., Dong, Y., Tang, J., and Li, J. (2024). Longwriter: Unleashing 10,000+ word generation from long context LLMs. arXiv preprint arXiv:2408.07055

[2] Brambilla, M., Ferrante, E., Birattari, M., and Dorigo, M. (2013). Swarm robotics: a review from the swarm engineering perspective. Swarm Intelligence, 7(1):1–41

[3] Claidière, N., Scott-Phillips, T. C., and Sperber, D. (2014). How Darwinian is cultural evolution? Philosophical Transactions of the Royal Society B: Biological Sciences, 369(1642)

[4] Dawkins, R. (1976). The Selfish Gene. Oxford University Press, Oxford

[5] Eckmann, J.-P., Kamphorst, S. O., and Ruelle, D. (1987). Recurrence plots of dynamical systems. Europhysics Letters, 4(9):973–977

[6] Giorgi, J., Nitski, O., Wang, B., and Bader, G. (2021). Declutr: Deep contrastive learning for unsupervised textual representations. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 879–895

[7] Gorle, A. R., Yadav, A. K. S., and Weissman, T. (2025). Quantifying information gain and redundancy in multi-turn LLM conversations. In First Workshop on Multi-Turn Interactions in Large Language Models

[8] Grassé, P.-P. (1959). La reconstruction du nid et les coordinations interindividuelles chez Bellicositermes natalensis et Cubitermes sp. La théorie de la stigmergie: Essai d'interprétation du comportement des termites constructeurs. Insectes Sociaux, 6(1):41–80

[9] Heylighen, F. (2016). Stigmergy as a universal coordination mechanism I: Definition and components. Cognitive Systems Research, 38:4–13

[10] Hong, K., Troynikov, A., and Huber, J. (2025). Context rot: How increasing input tokens impacts LLM performance. Technical report, Chroma

[11] Hong, S., Zhuge, M., Chen, J., Zheng, X., Cheng, Y., Wang, J., Zhang, C., Wang, Z., Yau, S. K. S., Lin, Z., et al. (2023). MetaGPT: Meta programming for a multi-agent collaborative framework. In The Twelfth International Conference on Learning Representations

[12] Li, G., Hammoud, H., Itani, H., Khizbullin, D., and Ghanem, B. (2023). Camel: Communicative agents for "mind" exploration of large language model society. Advances in Neural Information Processing Systems, 36:51991–52008

[13] Li, R., Zhao, X., and Moens, M.-F. (2022). A brief overview of universal sentence representation methods: A linguistic view. ACM Computing Surveys (CSUR), 55(3):1–42

[14] Lin, Z., Feng, M., Santos, C. N. d., Yu, M., Xiang, B., Zhou, B., and Bengio, Y. (2017). A structured self-attentive sentence embedding. arXiv preprint arXiv:1703.03130

[15] Liu, X., Dong, P., Hu, X., and Chu, X. (2024). Longgenbench: Long-context generation benchmark. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 865–883

[16] Marwan, N., Romano, M. C., Thiel, M., and Kurths, J. (2007). Recurrence plots for the analysis of complex systems. Physics Reports, 438(5-6):237–329

[17] Packer, C., Fang, V., Patil, S., Lin, K., Wooders, S., and Gonzalez, J. (2023). MemGPT: Towards LLMs as operating systems

[18] Park, J. S., O'Brien, J., Cai, C. J., Morris, M. R., Liang, P., and Bernstein, M. S. (2023). Generative agents: Interactive simulacra of human behavior. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, pages 1–22

[19] Perez, J., Kovač, G., Léger, C., Colas, C., Molinaro, G., Derex, M., Oudeyer, P.-Y., and Moulin-Frier, C. (2026). When LLMs play the telephone game: Cultural attractors as conceptual tools to evaluate LLMs in multi-turn settings

[20] Prigogine, I. (1978). Time, structure, and fluctuations. Science, 201(4358):777–785

[21] Speer, R. (2022). Wordfreq: A library for word frequencies, based on many sources

[22] Sperber, D. (1996). Explaining Culture: A Naturalistic Approach, volume 21. Oxford Blackwell

[23] Teleki, M., Bengali, V., Dong, X., Janjur, S. T., Liu, H., Liu, T., Wang, C., Liu, T., Zhang, Y., Shipman, F., et al. (2025). A survey on LLMs for story generation. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 13954–13966

[24] Wallace, E., Xiao, K., Leike, R., Weng, L., Heidecke, J., and Beutel, A. (2024). The instruction hierarchy: Training LLMs to prioritize privileged instructions. arXiv preprint arXiv:2404.13208

[25] Wang, Q., Hu, J., Li, Z., Wang, Y., Li, D., Hu, Y., and Tan, M. (2025). Generating long-form story using dynamic hierarchical outlining with memory-enhancement. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 1352–1391

[26] Zbilut, J. P. and Webber Jr, C. L. (1992). Embeddings and delays as derived from quantification of recurrence plots. Physics Letters A, 171(3-4):199–203
