Recurrent World Models Facilitate Policy Evolution
A generative recurrent neural network is quickly trained in an unsupervised manner to model popular reinforcement learning environments through compressed spatio-temporal representations. The world model's extracted features are fed into compact and simple policies trained by evolution, achieving state-of-the-art results in various environments. We also train our agent entirely inside an environment generated by its own internal world model, and transfer this policy back into the actual environment. Interactive version of paper at https://worldmodels.github.io
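The pipeline the abstract describes — a learned encoder compresses observations, a recurrent model carries temporal state, and only a small controller is trained by evolution — can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the sizes, the fixed random encoder and recurrent weights (standing in for the trained VAE and MDN-RNN), the toy reward, and the simple evolution strategy (in place of CMA-ES) are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes, chosen for illustration (the paper's components are much larger)
OBS, Z, H, A, T = 64, 8, 16, 2, 20

# Fixed observation sequence standing in for environment frames
OBS_SEQ = rng.normal(size=(T, OBS))

# "V": a fixed random linear encoder in place of the trained VAE
W_enc = rng.normal(size=(Z, OBS)) / np.sqrt(OBS)

# "M": a minimal recurrent state update in place of the trained MDN-RNN
W_h = rng.normal(size=(H, H + Z + A)) / np.sqrt(H + Z + A)

def rollout(params):
    """Total reward of a linear controller acting on (z, h) features."""
    W = params.reshape(A, Z + H)
    h = np.zeros(H)
    total = 0.0
    for obs in OBS_SEQ:
        z = np.tanh(W_enc @ obs)                      # V: compress observation
        a = np.tanh(W @ np.concatenate([z, h]))       # C: act on world-model features
        h = np.tanh(W_h @ np.concatenate([h, z, a]))  # M: advance recurrent state
        total -= np.sum((a - 0.5) ** 2)               # toy reward: prefer a ~= 0.5
    return total

def evolve(generations=40, pop=32, sigma=0.1):
    """Simple elite-averaging evolution strategy, standing in for CMA-ES."""
    mu = np.zeros(A * (Z + H))  # only the controller's parameters are evolved
    for _ in range(generations):
        noise = rng.normal(size=(pop, mu.size))
        scores = np.array([rollout(mu + sigma * n) for n in noise])
        elite = noise[np.argsort(scores)[-pop // 4:]]  # top quartile of perturbations
        mu = mu + sigma * elite.mean(axis=0)
    return mu

best = evolve()
```

The point of the sketch is the division of labor: the world model (V and M) does the heavy representational work, so the evolved controller can stay tiny — here a single linear map over the concatenated latent and recurrent state.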
Forward citations
Cited by 3 Pith papers
- Simulating clinical interventions with a generative multimodal model of human physiology
  HealthFormer is a generative multimodal transformer that forecasts individual physiological trajectories and simulates clinical interventions, outperforming clinical risk scores on disease prediction and matching tria...
- Grounded World Model for Semantically Generalizable Planning
  A vision-language-aligned world model turns visuomotor MPC into a language-following planner that reaches 87% success on 288 unseen semantic tasks where standard VLAs drop to 22%.
- Safety, Security, and Cognitive Risks in World Models
  World models enable efficient AI planning but create risks from adversarial corruption, goal misgeneralization, and human bias, demonstrated via attacks that amplify errors and reduce rewards on models like RSSM and D...