{"total":17,"items":[{"citing_arxiv_id":"2606.28476","ref_index":5,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"FADA: Few-Shot Domain Adaptation via Dynamics Alignment for Humanoid Control","primary_cat":"cs.RO","submitted_at":"2026-06-26T16:05:10+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"FADA is a three-stage Planner-IDM method that achieves few-shot domain adaptation for humanoid control by distilling an oracle policy then finetuning only the IDM on short target-domain rollouts via supervised learning.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.03297","ref_index":12,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"SplitAdapter: Load-Aware Humanoid Loco-Manipulation via Factorized Adaptation","primary_cat":"cs.RO","submitted_at":"2026-06-02T08:10:49+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"SplitAdapter factorizes adaptation into load-aware and dynamics-aware encoders using split world-model objectives, GRL regularization, and hierarchical FiLM, reporting higher full-task success than baselines across 2-6 kg masses and 0-60 cm heights.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.00133","ref_index":237,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"World Models: A Comprehensive Survey of Architectures, Methodologies, Reasoning Paradigms, and Applications","primary_cat":"cs.LG","submitted_at":"2026-05-28T21:23:24+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"The paper delivers a multi-axis taxonomy for world models that maps architectures, training families, reasoning strategies, and domains from 
early cognitive foundations through systems such as Dreamer, MuZero, and Sora while noting evaluation gaps.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.00113","ref_index":97,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"World Models for Robotic Manipulation: A Survey","primary_cat":"cs.RO","submitted_at":"2026-05-27T05:32:17+00:00","verdict":"ACCEPT","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Survey organizing world models for robotic manipulation into representation families, a functional taxonomy, and infrastructure roles across pretraining, post-training, and inference, while reviewing 34 datasets and evaluation protocols.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.17537","ref_index":6,"ref_count":2,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Self-supervised Hierarchical Visual Reasoning with World Model","primary_cat":"cs.AI","submitted_at":"2026-05-17T16:42:42+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"ResDreamer proposes a residual-reconstruction hierarchical world model for purely self-supervised visual foresight that claims SOTA sample and parameter efficiency in open-world RL.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.12084","ref_index":40,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Learning What Matters: Adaptive Information-Theoretic Objectives for Robot Exploration","primary_cat":"cs.RO","submitted_at":"2026-05-12T13:07:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"QOED selects identifiable parameter directions via Fisher matrix eigenspace analysis 
and modifies exploration objectives to approximate ideal information gain under bounded nuisance assumptions, yielding 21-35% performance gains in robotic tasks.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"is in Appendix B, following Eq. (4). We parameterizeT θ with a Transformer [52]: state, action, and parameter inputs are embedded by separate MLPs, processed by a six-block Transformer with Adaptive Layer Normalization (AdaLN), and mapped to the output by a final AdaLN modulation layer. Architecture details are in Appendix G. To learn the policy, we follow the MBPO paradigm [40] us- ing a learned dynamics modelq θ to train a PPO [53] policyπ. Specifically, we follow the Robotic World Model [40] pipeline. To prevent catastrophic failure and initialize the dynamics model with physics knowledge, we pretrain both the policy πand the dynamics modelq θ in simulation with domain randomization of physics parameters [54]. Upon deployment,"},{"citing_arxiv_id":"2605.07079","ref_index":31,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Learning Visual Feature-Based World Models via Residual Latent Action","primary_cat":"cs.CV","submitted_at":"2026-05-08T00:58:16+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"RLA-WM predicts residual latent actions via flow matching to create visual feature world models that outperform prior feature-based and diffusion approaches while enabling offline video-based robot RL.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"former: A 3d point cloud world model for multi-object, multi-material robotic manipulation. arXiv preprint arXiv:2506.23126, 2025. [30] Ying Chai, Litao Deng, Ruizhi Shao, Jiajun Zhang, Kangchen Lv, Liangjun Xing, Xiang Li, Hongwen Zhang, and Y ebin Liu. 
Gaf: Gaussian action field as a 4d representation for dynamic world modeling in robotic manipulation. arXiv preprint arXiv:2506.14135, 2025. [31] Chenhao Li, Andreas Krause, and Marco Hutter. Robotic world model: A neural network simulator for robust policy optimization in robotics. arXiv preprint arXiv:2501.10100, 2025. [32] SV Jyothir, Siddhartha Jalagam, Yann LeCun, and Vlad Sobal. Gradient-based planning with world models. arXiv preprint arXiv:2312.17227, pages 703-708, 2023. [33] Nicklas Hansen, Hao Su, and Xiaolong Wang."},{"citing_arxiv_id":"2604.21741","ref_index":33,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Hi-WM: Human-in-the-World-Model for Scalable Robot Post-Training","primary_cat":"cs.RO","submitted_at":"2026-04-23T14:42:54+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Hi-WM uses human interventions inside an action-conditioned world model with rollback and branching to generate dense corrective data, raising real-world success by 37.9 points on average across three manipulation tasks.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"[31] Michael Laskey, Jonathan Lee, Roy Fox, Anca Dragan, and Ken Goldberg. DART: Noise injection for robust imitation learning. In Proceedings of the 1st Annual Conference on Robot Learning, pages 143-156, 2017. [32] Chenhao Li, Andreas Krause, and Marco Hutter. Robotic world model: A neural network simulator for robust policy optimization in robotics. arXiv preprint arXiv:2501.10100, 2025. [33] Xuanlin Li, Kyle Hsu, Jiayuan Gu, Oier Mees, Karl Pertsch, Homer Rich Walke, Chuyuan Fu, Ishikaa Lunawat, Isabel Sieh, Sean Kirmani, Sergey Levine, Jiajun Wu, Chelsea Finn, Hao Su, Quan Vuong, and Ted Xiao. Evaluating real-world robot manipulation policies in simulation. 
In Proceedings of The 8th Conference on Robot Learning, pages 3705-3728, 2025."},{"citing_arxiv_id":"2604.21456","ref_index":63,"ref_count":2,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Tempered Sequential Monte Carlo for Trajectory and Policy Optimization with Differentiable Dynamics","primary_cat":"cs.LG","submitted_at":"2026-04-23T09:13:59+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Tempered sequential Monte Carlo samples from a Boltzmann-tilted distribution over controllers to optimize trajectories and policies under differentiable dynamics.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Differentiable Simulation for Robotics\". In: arXiv preprint arXiv:2409.07107 (2025). [61] Taeyoung Lee. \"Computational geometric mechanics and control of rigid bodies\". PhD thesis. University of Michigan, 2008. [62] Sergey Levine. \"Reinforcement learning and control as probabilistic inference: Tutorial and review\". In: arXiv preprint arXiv:1805.00909 (2018). [63] Chenhao Li, Andreas Krause, and Marco Hutter. \"Robotic world model: A neural network simulator for robust policy optimization in robotics\". In: arXiv preprint arXiv:2501.10100 (2025). [64] Yulin Li, Haoyu Han, Shucheng Kang, Jun Ma, and Heng Yang. 
\"On the Surprising Robustness of Sequential Convex Optimization for Contact-Implicit Motion Planning\"."},{"citing_arxiv_id":"2604.20990","ref_index":111,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"A Survey of Legged Robotics in Non-Inertial Environments: Past, Present, and Future","primary_cat":"cs.RO","submitted_at":"2026-04-22T18:21:49+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":2.0,"formal_verification":"none","one_line_summary":"A literature survey summarizing modeling, state estimation, control methods, applications, and open challenges for legged robots operating in non-inertial environments where the ground moves or accelerates.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"complex hybrid dynamics. Factor graphs [ 106], [ 80], [ 107] offer a natural extension for enforcing long-horizon con- sistency through loop closures and multi-sensor fusion. In parallel, learning-based methods may help capture nonlinear robot-platform interactions, building on recent progress in learned state estimation [ 108], [ 109] and robot dynamic modeling [ 110], [ 111]. Fourth, progress is also limited by the lack of public datasets and standardized benchmarks for non-inertial locomotion. Most existing datasets [ 112], [ 113], [ 114] as- sume stationary terrain and do not capture coupled robot- platform dynamics in settings such as ships, aircraft, or trains. 
Moreover, current evaluations are often restricted to controlled treadmill experiments."},{"citing_arxiv_id":"2604.20246","ref_index":32,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Cortex 2.0: Grounding World Models in Real-World Industrial Deployment","primary_cat":"cs.RO","submitted_at":"2026-04-22T06:49:12+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Cortex 2.0 introduces world-model-based planning that generates and scores future trajectories to outperform reactive vision-language-action baselines on industrial robotic tasks including pick-and-place, sorting, and unpacking.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"have explored using such models at inference time: IRASim [28] and GPC [29] showed that scoring candidate rollouts before execution improves task success over reactive policies, while GR-2 [30] and V-JEPA 2 [31] validated that joint pretraining on internet video and robot data supports strong physical reasoning with limited robot-specific supervision. Li et al. [32] further demonstrated this direction on deployment data. Cortex 2.0 builds on these findings by grounding world model training in continuously collected operational data and scoring imagined rollouts via PRO before any action is executed. 
Cortex 2.0 follows this direction: the world model is pretrained on internet-scale video and fine-tuned on deployment recordings at 30 Hz."},{"citing_arxiv_id":"2604.08780","ref_index":42,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Toward Hardware-Agnostic Quadrupedal World Models via Morphology Conditioning","primary_cat":"cs.RO","submitted_at":"2026-04-09T21:31:24+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Morphology-conditioned quadrupedal world model enables zero-shot generalization to new robot embodiments for locomotion tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.06168","ref_index":32,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Action Images: End-to-End Policy Learning via Multiview Video Generation","primary_cat":"cs.CV","submitted_at":"2026-04-07T17:59:30+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Action Images turn robot arm motions into interpretable multiview pixel videos, letting video backbones serve as zero-shot policies for end-to-end robot learning.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":", Du, Y., Sun, S.H., Tenenbaum, J.B.: Learning to act from actionless videos through dense correspondences. arXiv preprint arXiv:2310.08576 (2023) [31] Lee, J., Duan, J., Fang, H., Deng, Y., Liu, S., Li, B., Fang, B., Zhang, J., Wang, Y.R., Lee, S., et al.: Molmoact: Action reasoning models that can reason in space. arXiv preprint arXiv:2508.07917 (2025) [32] Li, C., Krause, A., Hutter, M.: Robotic world model: A neural network simulator for robust policy optimization in robotics. 
arXiv preprint arXiv:2501.10100 (2025) [33] Li, P., Chen, Y., Xu, Y., Yang, J., Wu, X., Guo, J., Sun, N., Qian, L., Li, X., Xiao, X., Liu, J., Liu, N., Kong, T., Huang, Y., Wang, L., Tan, T.: Multi-view video diffusion policy: A 3d spatio-temporal-aware video action"},{"citing_arxiv_id":"2604.01346","ref_index":8,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Safety, Security, and Cognitive Risks in World Models","primary_cat":"cs.CR","submitted_at":"2026-04-01T19:57:33+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"World models enable efficient AI planning but create risks from adversarial corruption, goal misgeneralization, and human bias, demonstrated via attacks that amplify errors and reduce rewards on models like RSSM and DreamerV3.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"World models are being deployed or actively developed for: • Autonomous driving: MILE [4], DriveDreamer [6], and GAIA-1 [5] use world models to simulate rare and adversarial traffic scenarios, train safety policies, and improve corner-case coverage. • Robotics: UniSim [ 7] learns interactive real-world simulators for zero-shot policy transfer; robotic world models [8] enable offline model-based RL on physical systems. • Agentic AI: LLM-based agents increasingly use world models-or world-model-like reasoning modules-for multi-step planning and counterfactual deliberation [10, 49]. 
• Social simulation: Foundation world models trained on video and text generate synthetic social environments for training and evaluation, with direct implications for influence operations and manipulation [50]."},{"citing_arxiv_id":"2602.11758","ref_index":27,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"HAIC: Humanoid Agile Object Interaction Control via Dynamics-Aware World Model","primary_cat":"cs.RO","submitted_at":"2026-02-12T09:34:35+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"HAIC enables robust humanoid interactions with underactuated objects by predicting their dynamics from proprioceptive history and using a world model for adaptive control.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2602.11075","ref_index":54,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"RISE: Self-Improving Robot Policy with Compositional World Model","primary_cat":"cs.RO","submitted_at":"2026-02-11T17:43:36+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"RISE combines a controllable dynamics model and progress value model into a closed-loop self-improving pipeline that updates robot policies entirely in imagination, reporting over 35% absolute gains on three real-world tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2511.17792","ref_index":19,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Target-Bench: Can Video World Models Achieve Mapless Path Planning with Semantic Targets?","primary_cat":"cs.CV","submitted_at":"2025-11-21T21:36:02+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Target-Bench shows the best off-the-shelf video 
world model scores only 0.341 on semantic target-approaching and directional consistency, with fine-tuning on a small robot dataset yielding measurable gains.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}