TCE bridges domain gaps in offline RL by selectively using source data or generating target-aligned transitions via a dual score-based model, outperforming baselines in experiments.
Diffstitch: Boosting offline reinforcement learning with diffusion-based trajectory stitching
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.LG 2representative citing papers
BiTrajDiff augments offline RL datasets by running independent forward and backward diffusion processes from intermediate states, yielding higher performance than prior one-directional data-augmentation baselines on D4RL.
citing papers explorer
-
Bridging Domain Gaps with Target-Aligned Generation for Offline Reinforcement Learning
TCE bridges domain gaps in offline RL by selectively using source data or generating target-aligned transitions via a dual score-based model, outperforming baselines in experiments.
-
BiTrajDiff: Bidirectional Trajectory Generation with Diffusion Models for Offline Reinforcement Learning
BiTrajDiff augments offline RL datasets by running independent forward and backward diffusion processes from intermediate states, yielding higher performance than prior one-directional data-augmentation baselines on D4RL.