METRA: Scalable unsupervised RL with metric-aware abstraction
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.LG (2)
Verdicts: UNVERDICTED (2)
Citing papers
- Unifying Goal-Conditioned RL and Unsupervised Skill Learning via Control-Maximization
  GCRL and MISL are unified as control maximization, with three inequivalent GCRL formulations each matched to a MISL objective via bounds on goal-sensitivity.
- Intention-Conditioned Flow Occupancy Models
  InFOM applies flow matching to model intention-conditioned occupancy measures for RL pre-training, reporting 1.8x median return gains and 36% higher success rates on benchmarks.