LLM-based multi-agent reinforcement learning: Current and future directions

Chuanneng Sun, Songjun Huang, Dario Pompili · 2023 · arXiv 2405.11106

13 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

ReCrit: Transition-Aware Reinforcement Learning for Scientific Critic Reasoning

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

ReCrit frames critic interaction as a correctness-transition problem and uses quadrant-based RL rewards to improve LLM performance on scientific reasoning benchmarks by rewarding corrections and robustness while penalizing sycophancy.

Multi-Agent Coordination Adaptation via Structure-Guided Orchestration

cs.MA · 2026-05-25 · unverdicted · novelty 6.0

MACA frames multi-agent coordination as posterior inference, learns a structural prior to guide orchestration, and reports 8.42% higher performance with 43.19% fewer tokens than adaptive baselines on benchmarks.

Robust Instruction Compliance in Cooperative Multi-Agent Reinforcement Learning

cs.AI · 2026-05-12 · unverdicted · novelty 6.0

MAVIC corrects Bellman backups at instruction boundaries by adjusting the incoming objective and restoring continuation value, enabling consistent estimation under stochastic instruction switching in cooperative MARL.

Do LLM-derived graph priors improve multi-agent coordination?

cs.LG · 2026-04-19 · unverdicted · novelty 6.0

LLM-generated coordination graph priors improve multi-agent reinforcement learning performance on MPE benchmarks, with models as small as 1.5B parameters proving effective.

Joint Optimization of Multi-agent Memory System

cs.MA · 2026-03-13 · unverdicted · novelty 6.0

CoMAM jointly optimizes agents in multi-agent LLM memory systems via end-to-end RL and adaptive credit assignment to improve collaboration and performance.

Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents

cs.CL · 2026-01-05 · unverdicted · novelty 6.0

AgeMem unifies long-term and short-term memory management in LLM agents by exposing memory operations as learnable tool actions trained via three-stage progressive reinforcement learning, outperforming baselines on long-horizon tasks.

WebSailor: Navigating Super-human Reasoning for Web Agent

cs.CL · 2025-07-03 · conditional · novelty 6.0

WebSailor trains open-source web agents to match proprietary performance on complex information-seeking tasks by generating high-uncertainty scenarios and using a new RL method called DUPO.

CoEvolve: Training LLM Agents via Agent-Data Mutual Evolution

cs.CL · 2026-04-17 · unverdicted · novelty 5.0

CoEvolve improves LLM agent performance by 15-19% on AppWorld and BFCL benchmarks through mutual evolution of the agent and data distribution using feedback-driven task synthesis.

Adaptive Obstacle-Aware Task Assignment and Planning for Heterogeneous Robot Teaming

cs.RO · 2025-10-15 · unverdicted · novelty 5.0

OATH combines adaptive Halton sampling, obstacle-aware clustering with auctions, and LLM-based instruction interpretation to improve task assignment and planning for heterogeneous robot teams in obstacle-rich environments.

Multi-Agent Systems: From Classical Paradigms to Large Foundation Model-Enabled Futures

cs.AI · 2026-04-20 · unverdicted · novelty 4.0

A survey comparing classical multi-agent systems with large foundation model-enabled multi-agent systems, showing how the latter enables semantic-level collaboration and greater adaptability.

A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Super Intelligence

cs.AI · 2025-07-28 · accept · novelty 4.0

The paper delivers the first systematic review of self-evolving agents, structured around what components evolve, when adaptation occurs, and how it is implemented.

Multi-Agent Collaboration Mechanisms: A Survey of LLMs

cs.AI · 2025-01-10 · unverdicted · novelty 4.0

The survey organizes LLM-based multi-agent collaboration mechanisms into a framework with dimensions of actors, types, structures, strategies, and coordination protocols, reviews applications across domains, and identifies challenges for future research.

Large Language Model-Brained GUI Agents: A Survey

cs.AI · 2024-11-27 · unverdicted · novelty 4.0

A survey consolidating frameworks, data practices, large action models, benchmarks, applications, and research gaps in LLM-brained GUI agents.

citing papers explorer

Showing 13 of 13 citing papers.

ReCrit: Transition-Aware Reinforcement Learning for Scientific Critic Reasoning · cs.LG · 2026-05-11 · unverdicted · none · ref 33
Multi-Agent Coordination Adaptation via Structure-Guided Orchestration · cs.MA · 2026-05-25 · unverdicted · none · ref 6
Robust Instruction Compliance in Cooperative Multi-Agent Reinforcement Learning · cs.AI · 2026-05-12 · unverdicted · none · ref 48
Do LLM-derived graph priors improve multi-agent coordination? · cs.LG · 2026-04-19 · unverdicted · none · ref 34
Joint Optimization of Multi-agent Memory System · cs.MA · 2026-03-13 · unverdicted · none · ref 26
Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents · cs.CL · 2026-01-05 · unverdicted · none · ref 2
WebSailor: Navigating Super-human Reasoning for Web Agent · cs.CL · 2025-07-03 · conditional · none · ref 18
CoEvolve: Training LLM Agents via Agent-Data Mutual Evolution · cs.CL · 2026-04-17 · unverdicted · none · ref 2
Adaptive Obstacle-Aware Task Assignment and Planning for Heterogeneous Robot Teaming · cs.RO · 2025-10-15 · unverdicted · none · ref 27
Multi-Agent Systems: From Classical Paradigms to Large Foundation Model-Enabled Futures · cs.AI · 2026-04-20 · unverdicted · none · ref 131
A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Super Intelligence · cs.AI · 2025-07-28 · accept · none · ref 125
Multi-Agent Collaboration Mechanisms: A Survey of LLMs · cs.AI · 2025-01-10 · unverdicted · none · ref 118
Large Language Model-Brained GUI Agents: A Survey · cs.AI · 2024-11-27 · unverdicted · none · ref 51
