Behavioral study of obedience
4 Pith papers cite this work. Polarity classification is still indexing.
Verdicts: unverdicted (4)
Citation roles: background (2)
Citation polarities: background (2)
citing papers explorer
- Beyond Goodhart's Law: A Dynamic Benchmark for Evaluating Compliance in Multi-Agent Systems
  MAC-Bench is a new adversarial benchmark that converts legal texts into executable scenarios via the SERV pipeline to measure procedural compliance in multi-agent LLM systems using CSR and MG metrics.
- Emergent Social Intelligence Risks in Generative Multi-Agent Systems
  Generative multi-agent systems exhibit emergent collusion and conformity behaviors that cannot be prevented by existing agent-level safeguards.
- A Tutorial on Cognitive Biases in Agentic AI-Driven 6G Autonomous Networks
  Randomized Weibull anchors and debiased collective memory with decay and inflection bonuses let agentic AI in 6G cut anchoring, temporal, and confirmation biases, doubling energy savings to 25% and reducing latency by 5x in simulations.
- "It didn't feel right but I needed a job so desperately": Understanding People's Emotions & Help Needs During Financial Scams
  A qualitative study maps emotions exploited by financial scammers and help-seeking needs at different scam stages, identifying risk factors and suggesting design implications for interventions.