NARCBench and five activation-probing methods detect multi-agent collusion with 0.73-1.00 AUROC across distribution shifts and steganographic tasks by aggregating per-agent signals.
Institutional AI: Governing LLM collusion via public governance graphs
5 Pith papers cite this work.
Citation summary
years: 2026 (5)
roles: background (1)
polarities: background (1)
Representative citing papers
Mechanical enforcement of governance rules in LLM-based financial decision systems reduces non-compliant deferrals by 73% and raises task accuracy from MCC 0.43 to 0.88, revealing that governance and task performance are distinct axes.
Gap analysis of MCP, A2A, ACP, ANP, and ERC-8004 shows none support the full set of membership, deliberation, voting, dissent, escalation, and audit primitives required for governed agent communities.
System 1 intuition in edge SLMs delivers 100% adversarial robustness and low latency for DAO consensus while System 2 reasoning causes 26.7% cognitive collapse and 17x slowdown.
The authors introduce agentic microphysics and generative safety to link local agent interactions to population-level risks in agentic AI through a causally explicit framework.
citing papers explorer
- The Cognitive Penalty: Ablating System 1 and System 2 Reasoning in Edge-Native SLMs for Decentralized Consensus
System 1 intuition in edge SLMs delivers 100% adversarial robustness and low latency for DAO consensus while System 2 reasoning causes 26.7% cognitive collapse and 17x slowdown.