The authors create the first large-scale dataset and taxonomy of failure modes in multi-agent LLM systems to explain their limited performance gains.
The surprising effectiveness of PPO in cooperative multi-agent games. Advances in Neural Information Processing Systems, 35:24611–24624.
4 Pith papers cite this work.
Representative citing papers
IBTS framework uses influence shaping to improve zero-shot human-machine teaming beyond partner diversity alone, with gains shown in Overcooked-AI simulations and a 30-subject human study.
Interactive IRL is cast as bi-level optimization with an inner loop learning expert rewards and an outer loop learning interaction policies, solved by the convergent BISIRL algorithm.
LeDRL combines a lightweight LLM generating strategy priors from network prompts with a self-attention DRL agent and reflective evaluator, reporting over 17% higher task success rate than baselines in collaborative edge computing.
citing papers explorer
- LLM-Enhanced Deep Reinforcement Learning for Task Offloading in Collaborative Edge Computing