Markov games as a framework for multi-agent reinforcement learning
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2). Verdicts: unverdicted (2).
citing papers explorer
- ATRS: Adaptive Trajectory Re-splitting via a Shared Neural Policy for Parallel Optimization
  ATRS uses a shared neural policy in a multi-agent MDP to adaptively re-split trajectory segments during parallel ADMM optimization, cutting iterations by up to 26% and time by 19.1% with zero-shot generalization.
- Finite-Time Analysis of Q-Value Iteration for General-Sum Stackelberg Games
  Provides the first finite-time convergence guarantees for Q-value iteration in general-sum Stackelberg Markov games.