AdaGamma stabilizes state-dependent discounting in deep actor-critic RL by adding a return-consistency regularizer, delivering gains on continuous-control benchmarks and a real-world logistics A/B test.
Rothkopf, and Heinz Koeppl
1 Pith paper cites this work.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
- AdaGamma: State-Dependent Discounting for Temporal Adaptation in Reinforcement Learning