Team-symmetric games always have team-symmetric Nash equilibria solvable via linear complementarity problems, and the DelAC actor-critic MARL algorithm outperforms existing methods in simulations.
hub
Benchmarking multi-agent deep reinforcement learn- ing algorithms in cooperative tasks
10 Pith papers cite this work. Polarity classification is still indexing.
hub tools
citation-role summary
citation-polarity summary
representative citing papers
MARS replaces additive clipping and soft penalties in multi-agent trust-region methods with a symmetric geometric barrier, matching or exceeding MAPPO and MASPO performance across 47 tasks in eight environments.
SACHI enriches agent representations via graph transformer convolutions over inter-agent graphs to enable holistic information integration, outperforming baselines across five cooperative tasks with statistical significance.
SOAR is a unified DRL method using soft allocations, event-driven MDP, and heterogeneous graph transformers that cuts global makespan by 7.5% and average order completion time by 15.4% at sub-100ms latency in RMFS.
MADDPG-K scales centralized critics in multi-agent RL by limiting each critic to k-nearest neighbors under Euclidean distance, yielding constant input size and competitive performance.
Evo-Memory is a new streaming benchmark and evaluation framework for self-evolving memory in LLM agents, unifying over ten memory modules and introducing the ReMem pipeline for continual improvement on multi-turn and reasoning datasets.
AdaFair-MARL enforces workload fairness as an explicit second-order cone constraint in cooperative MARL via adaptive primal-dual optimization, achieving near-perfect constraint satisfaction while preserving team performance.
Optimistic ε-Greedy Exploration adds decoupled optimistic networks that converge in probability to maximum returns and samples from them with probability ε to increase optimal joint-action frequency in CTDE MARL.
An empirical literature analysis reveals a bifurcation in RL environments into Semantic Prior (LLM-dominated) and Domain-Specific Generalization ecosystems with distinct cognitive fingerprints.
CoSER adaptively samples joint actions in CTDE MARL to reduce sampling error relative to the joint on-policy distribution, empirically improving reliability of independent policy gradient convergence.
citing papers explorer
-
DelAC: A Multi-agent Reinforcement Learning of Team-Symmetric Stochastic Games
Team-symmetric games always have team-symmetric Nash equilibria solvable via linear complementarity problems, and the DelAC actor-critic MARL algorithm outperforms existing methods in simulations.
-
Rethinking Ratio-Based Trust Regions for Policy Optimization in Multi-Agent Reinforcement Learning
MARS replaces additive clipping and soft penalties in multi-agent trust-region methods with a symmetric geometric barrier, matching or exceeding MAPPO and MASPO performance across 47 tasks in eight environments.
-
SACHI: Structured Agent Coordination via Holistic Information Integration in Multi-Agent Reinforcement Learning
SACHI enriches agent representations via graph transformer convolutions over inter-agent graphs to enable holistic information integration, outperforming baselines across five cooperative tasks with statistical significance.
-
SOAR: Real-Time Joint Optimization of Order Allocation and Robot Scheduling in Robotic Mobile Fulfillment Systems
SOAR is a unified DRL method using soft allocations, event-driven MDP, and heterogeneous graph transformers that cuts global makespan by 7.5% and average order completion time by 15.4% at sub-100ms latency in RMFS.
-
Scalable Neighborhood-Based Multi-Agent Actor-Critic
MADDPG-K scales centralized critics in multi-agent RL by limiting each critic to k-nearest neighbors under Euclidean distance, yielding constant input size and competitive performance.
-
Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory
Evo-Memory is a new streaming benchmark and evaluation framework for self-evolving memory in LLM agents, unifying over ten memory modules and introducing the ReMem pipeline for continual improvement on multi-turn and reasoning datasets.
-
AdaFair-MARL: Enforcing Adaptive Fairness Constraints in Multi-Agent Reinforcement Learning
AdaFair-MARL enforces workload fairness as an explicit second-order cone constraint in cooperative MARL via adaptive primal-dual optimization, achieving near-perfect constraint satisfaction while preserving team performance.
-
Optimistic {\epsilon}-Greedy Exploration for Cooperative Multi-Agent Reinforcement Learning
Optimistic ε-Greedy Exploration adds decoupled optimistic networks that converge in probability to maximum returns and samples from them with probability ε to increase optimal joint-action frequency in CTDE MARL.
-
From Pixels to Digital Agents: An Empirical Study on the Taxonomy and Technological Trends of Reinforcement Learning Environments
An empirical literature analysis reveals a bifurcation in RL environments into Semantic Prior (LLM-dominated) and Domain-Specific Generalization ecosystems with distinct cognitive fingerprints.
-
Centralized Adaptive Sampling for Reliable Co-Training of Independent Multi-Agent Policies
CoSER adaptively samples joint actions in CTDE MARL to reduce sampling error relative to the joint on-policy distribution, empirically improving reliability of independent policy gradient convergence.