A Robust and Constrained Multi-Agent Reinforcement Learning Electric Vehicle Rebalancing Method in AMoD Systems

Fei Miao; Shaofeng Zou; Shuo Han; Sihong He; Yue Wang

arxiv: 2209.08230 · v2 · pith:2EM5AVNKnew · submitted 2022-09-17 · 💻 cs.MA · cs.LG · cs.RO · cs.SY · eess.SY

Sihong He, Yue Wang, Shuo Han, Shaofeng Zou, Fei Miao

classification 💻 cs.MA cs.LG cs.RO cs.SY eess.SY

keywords model, robust, amod, rebalancing, systems, uncertainties, constrained, marl

Electric vehicles (EVs) play critical roles in autonomous mobility-on-demand (AMoD) systems, but their unique charging patterns increase the model uncertainties in AMoD systems (e.g., in the state transition probability). Because the training environment usually differs from the test or true environment, incorporating model uncertainty into system design is critical in real-world applications. However, existing literature on EV AMoD system rebalancing has not yet explicitly considered model uncertainties, and the coexistence of model uncertainties and constraints that the decisions must satisfy makes the problem even more challenging. In this work, we design a robust and constrained multi-agent reinforcement learning (MARL) framework with state transition kernel uncertainty for EV AMoD systems. We then propose a robust and constrained MARL algorithm (ROCOMA) with robust natural policy gradients (RNPG) that trains a robust EV rebalancing policy to balance the supply-demand ratio and the charging utilization rate across the city under model uncertainty. Experiments show that ROCOMA can learn an effective and robust rebalancing policy. It outperforms non-robust MARL methods in the presence of model uncertainties. It increases system fairness by 19.6% and decreases rebalancing costs by 75.8%.
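
To make the idea of "state transition kernel uncertainty" concrete, the following is a minimal, generic sketch of worst-case policy evaluation in a small tabular setting. It is not ROCOMA or RNPG: it is single-agent, has no constraints, and assumes a finite set of candidate kernels with the adversary choosing the worst kernel independently per state-action pair. The function name `robust_policy_value` and all numbers are hypothetical and chosen only for illustration.

```python
import numpy as np

def robust_policy_value(kernels, rewards, policy, gamma=0.9, iters=500):
    """Worst-case value of a fixed policy over a finite set of transition kernels.

    kernels: shape (K, S, A, S), K candidate kernels (assumed uncertainty set)
    rewards: shape (S, A)
    policy:  shape (S, A), each row is a distribution over actions
    """
    kernels = np.asarray(kernels, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    policy = np.asarray(policy, dtype=float)
    v = np.zeros(rewards.shape[0])
    for _ in range(iters):
        # Expected next-state value under every kernel: shape (K, S, A)
        next_v = kernels @ v
        # Adversary picks the worst kernel for each (state, action) pair
        worst_next = next_v.min(axis=0)
        q = rewards + gamma * worst_next
        # Average over the policy's action distribution
        v = (policy * q).sum(axis=1)
    return v

# Hypothetical 2-state, 2-action example
nominal = np.array([[[0.9, 0.1], [0.2, 0.8]],
                    [[0.7, 0.3], [0.1, 0.9]]])
perturbed = np.array([[[0.6, 0.4], [0.5, 0.5]],
                      [[0.4, 0.6], [0.3, 0.7]]])
rewards = np.array([[1.0, 0.0], [0.0, 2.0]])
policy = np.full((2, 2), 0.5)

v_nominal = robust_policy_value(nominal[None], rewards, policy)
v_robust = robust_policy_value(np.stack([nominal, perturbed]), rewards, policy)
```

In this sketch the robust value can never exceed the value computed under the nominal kernel alone, which illustrates why a policy tuned only to the training kernel may look better than it performs once the true environment differs.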

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

RED: Adaptive Real-Time DAG Scheduling for Robotic Inference under Environmental Dynamics
cs.RO 2026-05 unverdicted novelty 5.0

RED is a deadline-aware DAG scheduler for robotic multi-task inference that adapts to environmental dynamics and supports MIMONet deployment.

PIMbot: A Self-Adaptive Attack Framework for Adversarial Manipulation of Multi-Robot Reinforcement Learning
cs.RO 2026-05 unverdicted novelty 4.0

PIMbot introduces an adaptive attack using reward-channel and policy manipulation to disrupt cooperation in multi-robot social dilemma RL, shown effective in Gazebo simulation and on NVIDIA Jetson hardware.