Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
Phi-Actor-Critic is a new method that steers multi-agent reinforcement learning toward Pareto-efficient correlated equilibria using regret minimization and Lagrangian selection.
citing papers explorer
- TRIDENT: Breaking the Hybrid-Safety-Physics Coupling for Provably Safe Multi-Agent Reinforcement Learning
  TRIDENT is a MARL framework that combines Richardson-Romberg gradient correction, Lyapunov-constrained trust-region updates, and a physics-informed residual critic. It claims O(1/sqrt(K)) convergence to a constrained Nash equilibrium with O(sqrt(K)) violation bounds and large reductions in training violation.
- Phi-Actor-Critic: Steering General-Sum Games to Pareto-Efficient Correlated Equilibria
  Phi-Actor-Critic is a new method that steers multi-agent reinforcement learning toward Pareto-efficient correlated equilibria using regret minimization and Lagrangian selection.