We increased the R-Net size to an MLP of 64 units (we did not see an improvement in classification accuracy for large sizes) and used a local distance threshold τ = 10 on all tasks

Furthermore, noticing that Go-Fresh is tested on relatively simpler tasks compared to ours in their paper (Mezghani et al., 2023

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Occupancy Reward Shaping: Improving Credit Assignment for Offline Goal-Conditioned Reinforcement Learning

cs.LG · 2026-04-22 · conditional · novelty 6.0

Occupancy Reward Shaping extracts goal-reaching rewards from world-model occupancy measures using optimal transport, improving offline goal-conditioned RL performance 2.2x on 13 tasks without changing the optimal policy.

