International Conference on Machine Learning , pages=

Contrastive energy prediction for exact energy-guided diffusion sampling in offline reinforcement learning , author= · 2023

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

browse 4 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Beyond Penalization: Diffusion-based Out-of-Distribution Detection and Selective Regularization in Offline Reinforcement Learning

cs.LG · 2026-05-06 · unverdicted · novelty 7.0

DOSER detects OOD actions via diffusion-model denoising error and applies selective regularization based on predicted transitions, proving gamma-contraction with performance bounds and outperforming priors on offline RL benchmarks.

Ada-Diffuser: Latent-Aware Adaptive Diffusion for Decision-Making

cs.LG · 2026-05-15 · unverdicted · novelty 6.0

Ada-Diffuser is a causal diffusion model that jointly learns observed interaction structure and underlying latent dynamics from minimal observations for adaptive planning and policy learning.

Discrete Flow Matching for Offline-to-Online Reinforcement Learning

cs.LG · 2026-05-12 · unverdicted · novelty 6.0

DRIFT enables stable offline-to-online fine-tuning of CTMC policies in discrete RL via advantage-weighted discrete flow matching, path-space regularization, and candidate-set approximation.

Path-Coupled Bellman Flows for Distributional Reinforcement Learning

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

PCBF learns return distributions via source-consistent Bellman-coupled paths with shared noise and λ-parameterized control variates, reporting improved fidelity and stability on MRPs, OGBench, and D4RL.

citing papers explorer

Showing 4 of 4 citing papers.

Beyond Penalization: Diffusion-based Out-of-Distribution Detection and Selective Regularization in Offline Reinforcement Learning cs.LG · 2026-05-06 · unverdicted · none · ref 24
DOSER detects OOD actions via diffusion-model denoising error and applies selective regularization based on predicted transitions, proving gamma-contraction with performance bounds and outperforming priors on offline RL benchmarks.
Ada-Diffuser: Latent-Aware Adaptive Diffusion for Decision-Making cs.LG · 2026-05-15 · unverdicted · none · ref 27
Ada-Diffuser is a causal diffusion model that jointly learns observed interaction structure and underlying latent dynamics from minimal observations for adaptive planning and policy learning.
Discrete Flow Matching for Offline-to-Online Reinforcement Learning cs.LG · 2026-05-12 · unverdicted · none · ref 44
DRIFT enables stable offline-to-online fine-tuning of CTMC policies in discrete RL via advantage-weighted discrete flow matching, path-space regularization, and candidate-set approximation.
Path-Coupled Bellman Flows for Distributional Reinforcement Learning cs.LG · 2026-05-07 · unverdicted · none · ref 35
PCBF learns return distributions via source-consistent Bellman-coupled paths with shared noise and λ-parameterized control variates, reporting improved fidelity and stability on MRPs, OGBench, and D4RL.

International Conference on Machine Learning , pages=

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer