Multi-agent actor-critic for mixed cooperative-competitive environments
9 Pith papers cite this work.
citing papers explorer
- Quantum Advantage in Multi Agent Reinforcement Learning
  Entangled QMARL agents approach the Tsirelson bound of 0.854 in CHSH while unentangled versions match classical baselines, and hybrid quantum-classical setups outperform both in CoopNav.
- Scalable Neighborhood-Based Multi-Agent Actor-Critic
  MADDPG-K scales centralized critics in multi-agent RL by limiting each critic to k-nearest neighbors under Euclidean distance, yielding constant input size and competitive performance.
- Overcoming Environmental Meta-Stationarity in MARL via Adaptive Curriculum and Counterfactual Group Advantage
  CL-MARL uses an adaptive curriculum scheduler called FlexDiff and Counterfactual Group Relative Policy Advantage to break static-difficulty training in MARL and achieve higher win rates on hard StarCraft maps.
- Asynchronous Cooperative Multi-Agent Reinforcement Learning with Limited Communication
  AsynCoMARL is a new asynchronous MARL algorithm that matches leading baselines on success and collision rates while using 26% fewer messages via graph transformers on dynamic communication graphs.
- Coordination Architecture Shapes Continuous Demand Response Outcomes in Building Districts
  In a 25-building district simulation, the hybrid MPC-SAC architecture delivered the strongest balance of load tracking accuracy (4.8% NMBE), thermal comfort (16.8% exceedance), and lowest spatial variability compared to centralized MPC, decentralized SAC, MAPPO, and rule-based control.
- A Communication-Efficient Multi-Agent Actor-Critic Algorithm for Distributed Reinforcement Learning
  A communication-efficient multi-agent actor-critic algorithm solves distributed RL on strongly connected directed graphs by transmitting only two scalar values per communication step.
- Built Environment Reasoning from Remote Sensing Imagery Using Large Vision-Language Models
  Large vision-language models applied to multi-scale remote sensing imagery can generate recommendations on built environment design, constructability, land use, and risks for smart city decision-making.
- Topology-Driven Anti-Entanglement Control for Soft Robots
  TD-MARL uses shared topological states and invariants to coordinate soft robots and reduce entanglement risk, outperforming standard DRL in simulated convergence and anti-winding performance.
- A Distributionally Robust Reinforcement Learning Framework for Constrained Urban EV Dispatch