Agentic Microphysics: A Manifesto for Generative AI Safety
Pith reviewed 2026-05-10 09:38 UTC · model grok-4.3
The pith
Safety analysis for agentic AI must focus on the micro-level interactions where one agent's output becomes another's input.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Population-level risks in agentic AI arise from structured interaction among agents: communication, observation, and mutual influence shape collective behaviour over time. The paper introduces agentic microphysics to define the required level of analysis, namely the local interaction dynamics where one agent's output becomes another's input under specific protocol conditions. It pairs this with generative safety, a methodology of growing phenomena and eliciting risks from those micro-level conditions in order to identify sufficient mechanisms, detect thresholds, and design effective interventions.
What carries the argument
Agentic microphysics, the level of local interaction dynamics where one agent's output becomes another's input under specific protocol conditions, carries the argument by linking those dynamics causally to population-level outcomes via the generative safety methodology.
If this is right
- Safety interventions become possible by identifying and modifying the protocols that govern agent-to-agent information flow.
- Collective dynamics can be explained by tracing chains of outputs turning into inputs across time rather than by averaging agent properties.
- Thresholds for emergent risks can be detected through targeted growth of micro-interaction scenarios instead of large-scale observation.
- Research priorities shift from evaluating isolated models to mapping and controlling interaction structures.
- Design variables at the protocol level become the primary levers for reducing population-level harms.
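The idea of tracing chains of outputs turning into inputs can be illustrated with a minimal sketch. Everything here is an assumption-laden toy of our own, not the paper's model: agents whose next input is literally another agent's current output, with every output-to-input edge logged so that the collective state is explained by a traceable chain rather than by averaged agent properties.

```python
import random

def run_chain_trace(n_agents=5, steps=4, seed=0):
    """Toy population: each agent's next input is another agent's current
    output, routed by a trivial protocol (read one random other agent)."""
    rng = random.Random(seed)
    state = {i: 1.0 for i in range(n_agents)}   # each agent's current output
    edges = []                                  # (step, sender, receiver) log
    for t in range(steps):
        new_state = {}
        for i in range(n_agents):
            j = rng.choice([k for k in range(n_agents) if k != i])
            edges.append((t, j, i))             # j's output becomes i's input
            new_state[i] = state[j] * 1.1       # receiver amplifies slightly
        state = new_state
    return state, edges

state, edges = run_chain_trace()
# Any agent's final value is explained by walking its edge chain backwards,
# not by any property of that agent in isolation.
print(len(edges), round(state[0], 4))  # prints: 20 1.4641
```

The edge log is the point: the same final states could be reported as an aggregate average, but only the logged output-to-input chain supports the causal explanation the framework asks for.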
Where Pith is reading between the lines
- This approach could inform the construction of sandbox environments for multi-agent testing that focus on protocol variations rather than agent internals.
- It opens a path to borrow causal inference techniques from fields studying social or biological interactions to validate AI safety claims.
- Adoption would change evaluation practices to treat sequences of agent communications as the main data source for risk assessment.
- The framework suggests that preventing certain collective harms might require only local protocol constraints without needing global oversight.
Load-bearing premise
Population-level risks in agentic AI primarily arise from structured interactions that can be causally analyzed and intervened upon at the micro level of agent outputs becoming inputs.
What would settle it
A controlled simulation of multiple agents in which altering the protocols that feed one agent's output into another's input produced no measurable change in the frequency or severity of collective risky behaviors would refute the core claim.
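That settling experiment can be sketched as a minimal protocol ablation. The setup below is an illustrative assumption of ours, not the paper's design: a toy population in which each agent, with a protocol-controlled probability, replaces its claim with a random peer's output, and the measured collective risk is full convergence on a single claim. The framework predicts the protocol lever changes the risk frequency; a null result would count against it.

```python
import random

def run_population(copy_prob, n_agents=20, steps=30, rng=None):
    """Each sweep, every agent either keeps its claim or, with probability
    copy_prob (the protocol lever), copies a random peer's output."""
    claims = list(range(n_agents))          # initially all claims distinct
    for _ in range(steps):
        for i in range(n_agents):
            if rng.random() < copy_prob:
                j = rng.randrange(n_agents)
                claims[i] = claims[j]       # peer output becomes agent input
    return len(set(claims)) == 1            # risky outcome: full convergence

def risk_frequency(copy_prob, trials=200, seed=0):
    rng = random.Random(seed)
    return sum(run_population(copy_prob, rng=rng) for _ in range(trials)) / trials

# Intervening at the protocol level (damping the copy rate) should lower the
# frequency of the collective outcome; if it did not, the micro-level causal
# claim would be in trouble.
print(risk_frequency(0.9), risk_frequency(0.1))
```

Note the design choice: nothing about any individual agent changes between the two conditions; only the routing of outputs into inputs does, which is exactly the intervention class the manifesto singles out.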
Original abstract
This paper advances a methodological proposal for safety research in agentic AI. As systems acquire planning, memory, tool use, persistent identity, and sustained interaction, safety can no longer be analysed primarily at the level of the isolated model. Population-level risks arise from structured interaction among agents, through processes of communication, observation, and mutual influence that shape collective behaviour over time. As the object of analysis shifts, a methodological gap emerges. Approaches focused either on single agents or on aggregate outcomes do not identify the interaction-level mechanisms that generate collective risks or the design variables that control them. A framework is required that links local interaction structure to population-level dynamics in a causally explicit way, allowing both explanation and intervention. We introduce two linked concepts. Agentic microphysics defines the level of analysis: local interaction dynamics where one agent's output becomes another's input under specific protocol conditions. Generative safety defines the methodology: growing phenomena and elicit risks from micro-level conditions to identify sufficient mechanisms, detect thresholds, and design effective interventions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript is a conceptual manifesto arguing that safety analysis for agentic AI systems—those with planning, memory, tool use, and sustained multi-agent interaction—must move beyond single-agent or aggregate-level approaches. It identifies a methodological gap: current methods fail to isolate interaction-level mechanisms generating collective risks or the design variables controlling them. To address this, the authors propose 'Agentic microphysics' as the level of analysis focused on local dynamics where one agent's output becomes another's input under protocol conditions, and 'Generative safety' as the methodology of deriving population-level phenomena, risks, thresholds, and interventions directly from these micro-level conditions in a causally explicit manner.
Significance. If the proposed concepts were developed into an operational framework with concrete modeling tools and validation, the work could meaningfully advance AI safety research by supplying a missing bridge between individual agent behaviors and emergent collective risks. It correctly identifies that interaction protocols and output-as-input dynamics are under-studied relative to single-model alignment or high-level societal impact assessments. As presented, however, the contribution is limited to problem framing and terminology introduction rather than providing falsifiable mechanisms or testable interventions.
major comments (2)
- [Abstract and introduction of Agentic microphysics / Generative safety] Abstract and the section introducing the two core concepts: the central claim that single-agent or aggregate approaches 'do not identify the interaction-level mechanisms that generate collective risks or the design variables that control them' is asserted without any comparative analysis, counter-example, or reference to specific limitations in existing multi-agent safety literature. This leaves the asserted gap un-demonstrated and makes the necessity of the new framework difficult to evaluate.
- [Definition of Generative safety] The section defining Generative safety: the methodology is described as 'growing phenomena and elicit risks from micro-level conditions to identify sufficient mechanisms, detect thresholds, and design effective interventions,' yet no modeling language, state representation, simulation protocol, dynamical system, game form, or identification strategy is supplied. Without at least a sketch of how local interaction structure maps causally to population dynamics, the claim that the framework enables explanation and intervention cannot be assessed.
minor comments (2)
- The manuscript would benefit from a brief discussion of how 'Agentic microphysics' relates to or differs from existing frameworks in multi-agent systems, complex adaptive systems, or network science to improve clarity and avoid potential overlap.
- [Abstract] Several sentences, particularly in the abstract, are long and compound; splitting them would improve readability without altering meaning.
Simulated Author's Rebuttal
We thank the referee for the constructive and precise comments. We address each major point below, clarifying the scope of the manifesto while outlining targeted revisions to improve substantiation and accessibility of the proposed concepts.
Point-by-point responses
- Referee: [Abstract and introduction of Agentic microphysics / Generative safety] Abstract and the section introducing the two core concepts: the central claim that single-agent or aggregate approaches 'do not identify the interaction-level mechanisms that generate collective risks or the design variables that control them' is asserted without any comparative analysis, counter-example, or reference to specific limitations in existing multi-agent safety literature. This leaves the asserted gap un-demonstrated and makes the necessity of the new framework difficult to evaluate.
Authors: We accept that the necessity of the framework is presented as a conceptual observation rather than through detailed comparative analysis in the current draft. The manuscript is structured as a manifesto to identify an emerging methodological gap arising from the transition to agentic systems featuring sustained interaction, memory, and protocol-mediated influence. In revision we will expand the introduction with a concise comparative paragraph that references representative strands of multi-agent safety work (single-agent alignment techniques and aggregate societal-impact assessments) and note their typical omission of interaction-protocol variables as causal levers. We will also insert a short counter-example illustrating a collective risk (e.g., cascading misinformation) that emerges from local output-as-input dynamics but is not isolated by either single-agent or purely aggregate lenses. These additions will make the asserted gap more explicit without altering the manifesto character of the paper. revision: yes
- Referee: [Definition of Generative safety] The section defining Generative safety: the methodology is described as 'growing phenomena and elicit risks from micro-level conditions to identify sufficient mechanisms, detect thresholds, and design effective interventions,' yet no modeling language, state representation, simulation protocol, dynamical system, game form, or identification strategy is supplied. Without at least a sketch of how local interaction structure maps causally to population dynamics, the claim that the framework enables explanation and intervention cannot be assessed.
Authors: We agree that the current text supplies only a high-level methodological orientation and does not include an explicit mapping sketch or formal language. Because the paper is a manifesto whose primary aim is to name and motivate the required level of analysis, a complete operationalization lies outside its intended scope. To address the concern directly, we will add a brief illustrative subsection that walks through a minimal multi-agent protocol (e.g., repeated information exchange under a simple visibility rule) and shows, at the level of sufficient conditions, how micro-level output-as-input structure can generate a detectable population threshold and a corresponding intervention point. This example will remain conceptual and will not claim to constitute a full modeling toolkit, but it will allow readers to evaluate the causal directionality asserted by the framework. revision: yes
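The rebuttal's promised example can be made concrete with a hedged sketch of our own construction, not the authors': repeated information exchange on a ring under a simple visibility rule, where agents adopt a signal once the adopted fraction among visible peers crosses a Granovetter-style threshold. Sweeping the protocol variable (visibility) exposes a sharp population-level transition, which is also the natural intervention point.

```python
def cascade_size(n_agents=100, visibility=3, adopt_frac=0.3, seeds=2, steps=50):
    """Agents sit on a ring and see `visibility` neighbours on each side.
    A non-adopter adopts when the adopted fraction among its visible peers
    exceeds `adopt_frac` (a threshold rule in the style of Granovetter 1978)."""
    adopted = [i < seeds for i in range(n_agents)]
    for _ in range(steps):
        nxt = adopted[:]
        for i in range(n_agents):
            if adopted[i]:
                continue
            peers = [adopted[(i + d) % n_agents]
                     for d in range(-visibility, visibility + 1) if d != 0]
            if sum(peers) / len(peers) > adopt_frac:
                nxt[i] = True
        if nxt == adopted:      # fixed point reached: stop early
            break
        adopted = nxt
    return sum(adopted)

# Sweeping the visibility rule reveals where local exposure tips into a
# global cascade; with a fixed fractional threshold, wider visibility
# dilutes the seed signal and can suppress the cascade entirely.
for v in (1, 2, 3, 6):
    print(v, cascade_size(visibility=v))
```

In this toy, two seed adopters trigger a full cascade at small visibility but none at visibility 6, so the threshold lives in the protocol, not in any agent: exactly the kind of micro-level sufficient condition and intervention point the added subsection would need to exhibit, though nothing here claims to be the authors' intended construction.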
Circularity Check
Conceptual manifesto introduces framework terms without derivations or reductions
Full rationale
The paper presents a methodological proposal defining 'agentic microphysics' as the analysis level of local interaction dynamics (one agent's output becoming another's input) and 'generative safety' as the methodology of eliciting risks from micro conditions. It asserts a gap between single-agent/aggregate approaches and the needed causally explicit framework but supplies no equations, fitted parameters, predictions, or first-principles derivations. No self-citations, uniqueness theorems, or ansatzes are invoked to justify central claims. The content is purely definitional and propositional, remaining self-contained with no step that reduces to its own inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Safety analysis must shift from isolated models to population-level risks arising from structured agent interactions through communication and mutual influence.
invented entities (2)
- Agentic microphysics: no independent evidence
- Generative safety: no independent evidence
Reference graph
Works this paper leans on
- [1] Ashery, A. F., Aiello, L. M., and Baronchelli, A. (2025). Emergent social conventions and collective bias in LLM populations. Science Advances, 11(20), eadu9368.
- [2] Axelrod, R. (1984). The Evolution of Cooperation. New York: Basic Books.
- [3]
- [4]
- [5] Bruch, E. and Atwell, J. (2015). Agent-based models in empirical social research. Sociological Methods & Research, 44(2), 186–221.
- [6] Cemri, M., Pan, M. Z., Yang, S., et al. (2025). Why do multi-agent LLM systems fail? arXiv preprint arXiv:2503.13657.
- [7] Cho, Y.-M., Guntuku, S. C., and Ungar, L. (2025). Herd behavior: Investigating peer influence in LLM-based multi-agent systems. arXiv preprint arXiv:2505.21588.
- [8] Chomsky, N. (1965). Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
- [9] Chomsky, N. (2000). New Horizons in the Study of Language and Mind. Cambridge: Cambridge University Press.
- [10] Chuang, Y.-S., et al. (2024). Simulating opinion dynamics with networks of LLM-based agents. In Findings of ACL, pp. 3326–3346.
- [11] Coleman, J. S. (1990). Foundations of Social Theory. Cambridge, MA: Harvard University Press.
- [12]
- [13] Elsenbroich, C. (2012). Explanation in agent-based modelling: Functions, causality or mechanisms? Journal of Artificial Societies and Social Simulation, 15(3), Article 1.
- [14] Epstein, J. M. (1999). Agent-based computational models and generative social science. Complexity, 4(5), 41–60.
- [15] Epstein, J. M. (2006). Generative Social Science: Studies in Agent-Based Computational Modeling. Princeton: Princeton University Press.
- [16] Epstein, J. M. and Axtell, R. (1996). Growing Artificial Societies: Social Science from the Bottom Up. Cambridge, MA: MIT Press.
- [17] Fontana, N., Pierri, F., and Aiello, L. M. (2025). Nicer than Humans: How Do Large Language Models Behave in the Prisoner's Dilemma? Proceedings of the International AAAI Conference on Web and Social Media, 19(1), 522–535.
- [18] Foucault, M. (1977). Discipline and Punish: The Birth of the Prison. New York: Pantheon.
- [19] Granovetter, M. (1978). Threshold models of collective behavior. American Journal of Sociology, 83(6), 1420–1443.
- [20] Greenblatt, R., Denison, C., Wright, B., et al. (2024). Alignment faking in large language models. arXiv preprint arXiv:2412.14093.
- [21] Guo, T., et al. (2024). Large language model based multi-agents: A survey of progress and challenges. arXiv preprint arXiv:2402.01680.
- [22]
- [23]
- [24] Hedström, P. and Ylikoski, P. (2010). Causal mechanisms in the social sciences. Annual Review of Sociology, 36, 49–67.
- [25]
- [26] Larooij, M. and Törnberg, P. (2026). Validation is the central challenge for generative social simulation: A critical review of LLMs in agent-based modeling. Artificial Intelligence Review, 59, Article 15.
- [27]
- [28]
- [29]
- [30]
- [31] Macy, M. W. and Willer, R. (2002). From factors to actors: Computational sociology and agent-based modeling. Annual Review of Sociology, 28, 143–166.
- [32] Mahoney, J. (2012). The logic of process tracing tests in the social sciences. Sociological Methods & Research, 41(4), 570–597.
- [33]
- [34] Merton, R. K. (1948). The self-fulfilling prophecy. Antioch Review, 8(2), 193–210.
- [35] Park, J. S., O'Brien, J. C., Cai, C. J., Morris, M. R., Liang, P., and Bernstein, M. S. (2023). Generative agents: Interactive simulacra of human behavior. In Proceedings of UIST 2023, pp. 1–22.
- [36] Piao, J., et al. (2025). AgentSociety: Large-scale simulation of LLM-driven generative agents advances understanding of human behaviors and society. arXiv preprint arXiv:2502.08691.
- [37] Pozzoni, G. and Kaidesoja, T. (2021). Context in mechanism-based explanation. Philosophy of the Social Sciences, 51(6), 523–554.
- [38] Prandi, M., Pierucci, F., Bisconti Lucidi, P., Bracale Syrnikov, M., and Galisai, M. (2026). Herd behaviour and attention profiles in LLM multi-agent news-feed environments. Unpublished manuscript, ICARO Lab.
- [39] Rizzi, L. (2017). The concept of explanatory adequacy. In I. Roberts (ed.), The Oxford Handbook of Universal Grammar, pp. 97–113. Oxford: Oxford University Press.
- [40] Rossetti, G., et al. (2025). Towards operational validation of LLM-agent social simulations: A replicated study of a Reddit-like technology forum. arXiv preprint arXiv:2508.21740.
- [41] Schelling, T. C. (1971). Dynamic models of segregation. Journal of Mathematical Sociology, 1(2), 143–186.
- [42] Tilly, C. (2001). Mechanisms in political processes. Annual Review of Political Science, 4(1), 21–41.
- [43] Tran, K.-T., et al. (2025). Multi-agent collaboration mechanisms: A survey of LLMs. arXiv preprint arXiv:2501.06322.
- [44] Wang, C., Liu, Z., Yang, D., and Chen, X. (2025). Decoding echo chambers: LLM-powered simulations revealing polarization in social networks. In Proceedings of COLING 2025, pp. 3913–3923.
- [45] Weber, M. (1922). Economy and Society. Berkeley: University of California Press. English translation 1978 by G. Roth and C. Wittich.
- [46] Weng, Z., Chen, G., and Wang, W. (2025). Do as we do, not as you think: The conformity of large language models. International Conference on Learning Representations (ICLR 2025).
- [47] Ylikoski, P. (2021). Understanding the Coleman boat. In G. Manzo (ed.), Research Handbook on Analytical Sociology, pp. 49–63. Cheltenham: Edward Elgar.
- [48] Zhang, K., Yu, X., Peng, H., Yang, Z., Tian, Y., Jin, H., Feng, T., and Lin, H. (2026). Towards efficient optimization of multi-agent social simulation via large language models. International Journal of Machine Learning and Cybernetics, 17, Article 1.
- [49] Zomer, N. and De Domenico, M. (2026). Unraveling the emergence of collective behavior in networks of cognitive agents. npj Artificial Intelligence, 2, Article 36.