pith. machine review for the scientific record.

BattleAgentBench: A benchmark for evaluating cooperation and competition capabilities of language models in multi-agent systems

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

fields

cs.AI: 3 · cs.MA: 1

years

2026: 3 · 2025: 1

representative citing papers

Why Do Multi-Agent LLM Systems Fail?

cs.AI · 2025-03-17 · unverdicted · novelty 8.0

The authors create the first large-scale dataset and taxonomy of failure modes in multi-agent LLM systems to explain their limited performance gains.
