The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games. Advances in Neural Information Processing Systems, 35:24611–24624

Chao Yu, Akash Velu, Eugene Vinitsky, Jiaxuan Gao, Yu Wang, Alexandre Bayen, Yi Wu · 2022

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1 · other: 1

citation-polarity summary

background: 1 · unclear: 1

representative citing papers

Why Do Multi-Agent LLM Systems Fail?

cs.AI · 2025-03-17 · unverdicted · novelty 8.0

The authors create the first large-scale dataset and taxonomy of failure modes in multi-agent LLM systems to explain their limited performance gains.

Beyond Partner Diversity: An Influence-Based Team Steering Framework for Zero-Shot Human-Machine Teaming

cs.AI · 2026-05-14 · unverdicted · novelty 6.0

The IBTS framework uses influence shaping to improve zero-shot human-machine teaming beyond what partner diversity alone achieves, with gains shown in Overcooked-AI simulations and a 30-subject human study.

Interactive Inverse Reinforcement Learning of Interaction Scenarios via Bi-level Optimization

cs.LG · 2026-05-01 · unverdicted · novelty 6.0

Interactive IRL is cast as bi-level optimization with an inner loop learning expert rewards and an outer loop learning interaction policies, solved by the convergent BISIRL algorithm.

LLM-Enhanced Deep Reinforcement Learning for Task Offloading in Collaborative Edge Computing

cs.DC · 2026-05-07 · unverdicted · novelty 5.0

LeDRL combines a lightweight LLM that generates strategy priors from network prompts with a self-attention DRL agent and a reflective evaluator, reporting over 17% higher task success rate than baselines in collaborative edge computing.

citing papers explorer

LLM-Enhanced Deep Reinforcement Learning for Task Offloading in Collaborative Edge Computing

cs.DC · 2026-05-07 · unverdicted · none · ref 19
