Assessing adaptive world models in machines with novel games

Lance Ying, Katherine M Collins, Prafull Sharma, Cedric Colas, Kaiya Ivy Zhao, Adrian Weller, Zenna Tavares, Phillip Isola, Samuel J Gershman, Jacob D Andreas, et al. · 2025 · arXiv 2507.12821

5 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Open-Ended Video Game Glitch Detection with Agentic Reasoning and Temporal Grounding

cs.MA · 2026-04-09 · unverdicted · novelty 7.0

Introduces the first benchmark for open-ended video game glitch detection with temporal localization and proposes GliDe, an agentic framework that achieves stronger performance than vanilla multimodal models.

Training Language Agents to Learn from Experience

cs.LG · 2026-05-19 · unverdicted · novelty 6.0

Introduces the ICT framework and an RL pipeline to train language agent reflectors that distill experience into reusable prompts, outperforming baselines on held-out tasks in ALFWorld and MiniHack.

stable-worldmodel: A Platform for Reproducible World Modeling Research and Evaluation

cs.LG · 2026-05-20 · unverdicted · novelty 5.0

Presents stable-worldmodel (swm), a platform with a high-performance data layer, modern world model baselines, planning solvers, and extended environments for reproducible research and generalization evaluation.

Scalable Environments Drive Generalizable Agents

cs.AI · 2026-05-18 · unverdicted · novelty 5.0

Argues that generalizable agents require environment scaling via diverse executable rule sets, which a new taxonomy distinguishes from trajectory scaling and task scaling.

Hypothesis Generation and Inductive Inference in Children and Language Models

cs.AI · 2026-05-23 · unverdicted · novelty 4.0

Children and LLM agents show parallel adaptations to evidence reliability in a Bayesian program induction task but differ in information-seeking costs and compliance.

citing papers explorer

Showing 2 of 2 citing papers after filters.

Training Language Agents to Learn from Experience

cs.LG · 2026-05-19 · unverdicted · none · ref 24

Introduces the ICT framework and an RL pipeline to train language agent reflectors that distill experience into reusable prompts, outperforming baselines on held-out tasks in ALFWorld and MiniHack.

stable-worldmodel: A Platform for Reproducible World Modeling Research and Evaluation

cs.LG · 2026-05-20 · unverdicted · none · ref 62

Presents stable-worldmodel (swm), a platform with a high-performance data layer, modern world model baselines, planning solvers, and extended environments for reproducible research and generalization evaluation.
