
arxiv: 2512.02019 · v3 · pith:F75ZDXRVnew · submitted 2025-12-01 · 💻 cs.LG · cs.AI · stat.ML

Diffusion-Augmented Markov Decision Processes for Maximum Entropy Reinforcement Learning

classification 💻 cs.LG · cs.AI · stat.ML
keywords: policy, diffusion, da-mdp, distributions, entropy, optimization, processes, benchmarks

Diffusion models excel at sampling from complex, unnormalized distributions. In this work, we extend Maximum Entropy Reinforcement Learning (ME-RL) to diffusion processes, enabling sampling from the optimal policy trajectory distribution. By minimizing a tractable upper bound on the reverse KL divergence between the diffusion policy and the optimal policy trajectory distributions, we derive a modified surrogate objective and introduce Diffusion-Augmented Markov Decision Processes (DA-MDPs). DA-MDPs allow for seamless integration of diffusion policies into any ME-RL method with minimal modifications. We demonstrate its effectiveness by adapting Proximal Policy Optimization (PPO), Wasserstein Policy Optimization (WPO), and Relative Entropy Pathwise Policy Optimization (REPPO) into their diffusion-based variants: DA-MDP: PPO, DA-MDP: WPO, and DA-MDP: REPPO. Empirical results on standard continuous-control benchmarks show that our approach matches or outperforms baseline methods, while experiments on multimodal benchmarks confirm its ability to model multimodal action distributions.
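For orientation, the kind of bound the abstract mentions can be sketched with two standard identities. The notation here ($\pi^*$, $Q^*$, $\alpha$, $q$, $p$, and the denoising chain $a^{N} \to \dots \to a^{0}$) is assumed for illustration and is not taken from the paper, whose exact surrogate objective may differ.

```latex
% Maximum-entropy RL: the optimal policy is a Boltzmann distribution
% over actions, with temperature \alpha and soft action-value Q^*.
\pi^*(a \mid s) \;\propto\; \exp\!\big( Q^*(s, a) / \alpha \big)

% A diffusion policy produces the action a^0 through a chain
% a^N \to a^{N-1} \to \dots \to a^0, so its marginal q(a^0 \mid s)
% is generally intractable. By the chain rule for KL divergence,
% the divergence between the joint (path) distributions upper-bounds
% the divergence between the marginals:
D_{\mathrm{KL}}\!\big( q(a^{0} \mid s) \,\big\|\, p(a^{0} \mid s) \big)
\;\le\;
D_{\mathrm{KL}}\!\big( q(a^{0:N} \mid s) \,\big\|\, p(a^{0:N} \mid s) \big)
```

The right-hand side factorizes over denoising steps, which is what makes a path-level reverse KL a tractable training target when the marginal is not available in closed form.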



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Guided Discovery of New Behaviors using Diffusion Policies

    cs.RO · 2026-06 · unverdicted · novelty 6.0

    A framework combining Feynman-Kac correctors with a guiding potential mines and repairs novel trajectories to enable diffusion policies to discover diverse executable behaviors in robotic manipulation.