2 Pith papers cite this work.
Fields: cs.LG (2)

Citing papers:
- Bounded Ratio Reinforcement Learning
  BRRL derives an analytic optimal policy for regularized constrained RL that guarantees monotonic improvement, and yields the BPO algorithm, which matches or exceeds PPO.
- SALSA-RL: Stability Analysis in the Latent Space of Actions for Reinforcement Learning
  SALSA-RL introduces latent-space stability analysis for the actions of pretrained RL agents, using an encoder-decoder and state-dependent linear dynamics to enable non-invasive interpretability.