{"total":14,"items":[{"citing_arxiv_id":"2606.09931","ref_index":5,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"A Note on the Strategic Confinement Problem","primary_cat":"cs.GT","submitted_at":"2026-06-07T16:36:58+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"Strategic agents can achieve high-harm outcomes via low-capacity channels by concentrating residual capacity on high-impact predicates of confidential data, so leakage bounds need not bound worst-case harm.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.29434","ref_index":3,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"AliMark: Enhancing Robustness of Sentence-Level Watermarking Against Text Paraphrasing","primary_cat":"cs.CR","submitted_at":"2026-05-28T06:30:43+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"AliMark introduces a two-stage detection strategy with multi-candidate bit sequence alignment to improve robustness of sentence-level text watermarks against paraphrasing attacks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.19240","ref_index":5,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"CASPIAN: Online Detection and Attribution of Cascade Attacks in LLM Multi-Agent Systems via Cross-Channel Causal Monitoring","primary_cat":"cs.MA","submitted_at":"2026-05-19T01:16:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"CASPIAN introduces unified cross-channel causal monitoring via late-interaction conditional transfer entropy to detect cascade onset and attribute origin, bridge, and amplifier agents in LLM multi-agent 
systems.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.08427","ref_index":10,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"The Attacker in the Mirror: Breaking Self-Consistency in Safety via Anchored Bipolicy Self-Play","primary_cat":"cs.AI","submitted_at":"2026-05-08T19:41:59+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Anchored Bipolicy Self-Play trains role-specific LoRA adapters on a frozen base model to break self-consistency collapse in self-play red-teaming, yielding up to 100x parameter efficiency and stronger safety on Qwen2.5 models.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"1 Introduction While concerns surrounding the safety of NLP models have long been recognised [1], Large Language Models (LLMs) [3], given their capabilities and widespread adoption, pose significant security and safety risks that have the potential to erode trust in these systems [ 19], particularly with the rise of LLM agents and agentic applications [10]. Although safeguarding generated outputs is critical for preventing misuse, recent developments reveal ongoing and substantial obstacles to achieving dependable LLM safety [ 37]. Model patching to prevent jailbreaks constitutes one mitigation strategy [36]. 
More recently, self-play reinforcement learning, which engages multiple models in iterative attack-and-defence cycles, has shown promise in improving model robustness [22]."},{"citing_arxiv_id":"2605.06738","ref_index":14,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"From Specification to Deployment: Empirical Evidence from a W3C VC + DID Trust Infrastructure for Autonomous Agents","primary_cat":"cs.CR","submitted_at":"2026-05-07T14:09:51+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"MolTrust deploys a W3C VC+DID trust infrastructure for AI agents with kernel-layer authorization, cross-protocol interoperability, and layered Sybil resistance, operational since March 2026 across eight verticals.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"evidence against independent verifier implementations, and layered Sybil resistance. Each of these distinguishing mechanisms is independently verifiable against the live endpoint and the published conformance specification. This deployment-first contribution is complementary to, and validates, a growing body of academic literature on multi-agent security [14], inter-agent trust models [15], governance architectures [16], [17], conceptual DID-and-VC frameworks for agents [36], and threat taxonomies [18], [19], [20]. The remainder of this paper is organized as follows. 
Section 2 positions MolTrust within the regulatory context (NIST, IMDA, EU AI Act), the industry framework convergence (Anthropic, Google, Microsoft), and the academic literature."},{"citing_arxiv_id":"2605.00073","ref_index":3,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"AgentReputation: A Decentralized Agentic AI Reputation Framework","primary_cat":"cs.AI","submitted_at":"2026-04-30T12:33:39+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"AgentReputation proposes separating AI agent task execution, reputation management, and secure record-keeping into distinct layers, with context-specific reputation cards and a risk-based policy engine to handle verification in decentralized settings.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Section 4 outlines open research challenges. Section 5 concludes the paper. 2 Related Work Emerging research in trust and reputation in LLM in SE identifies several trust challenges [14]. Niu et al. [19] demonstrate that multi-agent LLMs can learn deceptive behaviors to manipulate evaluators. arXiv:2605.00073v1 [cs.AI] 30 Apr 2026 Chishti et al. de Witt [3] surveys security gaps in multi-agent systems without centralized oversight and observed that collusion, weak attribution, and systematic cascades make the worst-case behavior difficult to contain without governance mechanisms. Park et al. [20] document AI deception risks and potential solutions. 
However, these works mostly characterize threats and do not discuss an operational"},{"citing_arxiv_id":"2604.19211","ref_index":21,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"ClawNet: Human-Symbiotic Agent Network for Cross-User Autonomous Cooperation","primary_cat":"cs.AI","submitted_at":"2026-04-21T08:15:05+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"ClawNet digitizes human collaborative relationships into a network of identity-governed AI agents that collaborate on behalf of their owners through a central orchestrator enforcing binding and verification.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.15367","ref_index":62,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"SoK: Security of Autonomous LLM Agents in Agentic Commerce","primary_cat":"cs.CR","submitted_at":"2026-04-15T01:55:40+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"The paper systematizes security for LLM agents in agentic commerce into five threat dimensions, identifies 12 cross-layer attack vectors, and proposes a layered defense architecture.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"[54]; Cai et al. [55] T2T Tool-to-Transaction Acharya [18]; Shittu [56]; Ruan et al. [57] Deng et al. [58] P2K Prompt-to-Key Acharya [18] Steinberger [3]; Luo et al. [4]; Rizinski and Trajanov [5] C2E Collusion-to-Escrow Virtuals Protocol [30]; Crapis et al. [35] Liu et al. [11]; Yu et al. [59]; de Witt [60]; Hu and Rong [61] O2P Oracle-to-Position Moreno [62]; Assis et al. [63] Nabar and Shroff [64]; Kim et al. [65] N2C Neg-to-Compliance Faysal et al. [66] Liu et al. [11]; Allouah et al. [51]; Hornuf et al. [67] I2M Identity-to-Market Xu et al. 
[68] No released derived-support evidence in the current snapshot. S2I Supply-to-Integrity Model Context Protocol [43]; Ruan et al. [57] Maloyan and Namiot [49]; Zhang et al."},{"citing_arxiv_id":"2604.04604","ref_index":102,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"AI Agents Under EU Law","primary_cat":"cs.CY","submitted_at":"2026-04-06T11:47:38+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"AI agent providers face an exhaustive inventory requirement for actions and data flows, as high-risk systems with untraceable behavioral drift cannot meet the AI Act's essential requirements.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"The US regulatory landscape presents a structurally distinct compliance topology: no unified AI statute, but a dense web of federal conduct-based enforcement (FTC Section 5, with the December 2025 set-aside of the Rytr consent order signalling a retreat from instrumentalities liability for neutral AI tools [55]; Selbst & Barocas provide the foundational analysis of how FTC unfairness authority applies to AI [102]), sector-specific instruments (the 2025 COPPA amendments imposing separate parental consent for AI model training [54]; the FCC's 2024 declaratory ruling classifying AI-generated voices as \"artificial\" under the TCPA [52]; the FDA's Predetermined Change Control Plan pathway for adaptive medical AI [111]), voluntary standards (the NIST AI RMF [92] and its"},{"citing_arxiv_id":"2605.02900","ref_index":74,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Safety in Embodied AI: A Survey of Risks, Attacks, and 
Defenses","primary_cat":"cs.CR","submitted_at":"2026-03-28T13:21:44+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"showed that self-interested agents in multi-robot planning tasks learn manipulative communication strategies through a differentiable shared channel, suggesting that adversarial behavior may emerge naturally from competitive pressure. He et al. [126] introduced the Agent-in-the-Middle (AiTM) attack that intercepts and manipulates messages between LLM-based agents during cooperative planning. Schroeder de Witt [74] taxonomizes multi-agent security threats including cascading failures, monoculture collapse, and conformity bias that drives false consensus on unsafe plans. 4.3.2 Goal Conflicts Adversarial or self-interested agents exploit cooperative planning protocols to advance conflicting objectives. Choudhury et al. [70] formulated robust task allocation strategies that maintain plan quality under adversarial"},{"citing_arxiv_id":"2603.09002","ref_index":50,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Security Considerations for Multi-agent Systems","primary_cat":"cs.CR","submitted_at":"2026-03-09T22:46:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"No existing AI security framework covers a majority of the 193 identified multi-agent system threats in any category, with OWASP Agentic Security Initiative achieving the highest overall coverage at 65.3%.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2510.23883","ref_index":147,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Agentic AI Security: Threats, Defenses, Evaluation, and Open 
Challenges","primary_cat":"cs.AI","submitted_at":"2025-10-27T21:48:11+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"A survey that taxonomizes threats to agentic AI, reviews benchmarks and evaluation methods, discusses technical and governance defenses, and identifies open challenges.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Multi-agent security tax: Trading off security and collaboration capabilities in multi-agent systems. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 27573-27581, 2025. [146] Satbir Singh. Llm-based agents: The benefits and the risks. https://www.enkryptai.com/blog/llm-agents-benefits-risks, February 2025. Accessed: 2025-08-21. [147] Christian Schroeder de Witt. Open challenges in multi-agent security: Towards secure systems of interacting ai agents. arXiv preprint arXiv:2505.02077, 2025. [148] Hadi Askari, Shivanshu Gupta, Fei Wang, Anshuman Chhabra, and Muhao Chen. LayerIF: Estimating layer quality for large language models using influence functions. NeurIPS, 2025. 
[149] Hadi Askari, Shivanshu Gupta, Terry Tong, Fei Wang, Anshuman Chhabra, and Muhao Chen."},{"citing_arxiv_id":"2510.14133","ref_index":6,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Formalizing the Safety, Security, and Functional Properties of Agentic AI Systems","primary_cat":"cs.AI","submitted_at":"2025-10-15T22:02:30+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Introduces host agent and task lifecycle models plus 30 temporal logic properties to enable formal verification of liveness, safety, completeness, and fairness in agentic AI systems.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2510.12826","ref_index":43,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Scheming Ability in LLM-to-LLM Strategic Interactions","primary_cat":"cs.CL","submitted_at":"2025-10-11T04:42:29+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Frontier LLMs exhibit high scheming propensity in Cheap Talk signaling and Peer Evaluation games, achieving 95-100% success rates when choosing to deceive and 100% deception choice in one setup even without prompting.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}