World Models

David Ha · 2018 · cs.LG · arXiv 1803.10122

Canonical reference. 88% of citing Pith papers cite this work as background.

141 Pith papers citing it

Background: 88% of classified citations

abstract

We explore building generative neural network models of popular reinforcement learning environments. Our world model can be trained quickly in an unsupervised manner to learn a compressed spatial and temporal representation of the environment. By using features extracted from the world model as inputs to an agent, we can train a very compact and simple policy that can solve the required task. We can even train our agent entirely inside of its own hallucinated dream generated by its world model, and transfer this policy back into the actual environment. An interactive version of this paper is available at https://worldmodels.github.io/
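The abstract's idea of feeding world-model features into "a very compact and simple policy" can be illustrated with a small sketch. This is not the authors' code: the feature sizes (a 32-dimensional latent `z`, a 256-dimensional recurrent state `h`, 3 action outputs), the `tanh` squashing, and the name `controller` are all assumptions chosen for illustration, and the world model itself is replaced by random placeholder vectors.

```python
import numpy as np

# Assumed sizes for illustration only; the abstract does not state them.
Z_DIM, H_DIM, A_DIM = 32, 256, 3

def controller(z, h, W, b):
    """Hypothetical linear policy acting on concatenated world-model features."""
    features = np.concatenate([z, h])   # compressed spatial + temporal features
    return np.tanh(W @ features + b)    # squash actions into [-1, 1] (assumption)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(A_DIM, Z_DIM + H_DIM))
b = np.zeros(A_DIM)

# Placeholder features standing in for the outputs of a trained world model.
z = rng.normal(size=Z_DIM)
h = np.zeros(H_DIM)

action = controller(z, h, W, b)
n_params = W.size + b.size              # total trainable parameters of the policy
print(action.shape, n_params)
```

Under these assumed sizes the policy has only a few hundred parameters, which is what makes it practical to train separately from the much larger world model.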

citation-role summary

background 36 · method 3 · other 1

citation-polarity summary

background 35 · use method 3 · unclear 2

authors

David Ha, Jürgen Schmidhuber

representative citing papers

From Generalist to Specialist Representation

cs.LG · 2026-05-12 · unverdicted · novelty 8.0

Task structure is identifiable across time steps and task-relevant representations are identifiable within steps in a nonparametric setting under sparsity regularization.

EgoMemReason: A Memory-Driven Reasoning Benchmark for Long-Horizon Egocentric Video Understanding

cs.CV · 2026-05-11 · unverdicted · novelty 8.0

EgoMemReason is a new benchmark showing that even the best multimodal models achieve only 39.6% accuracy on reasoning tasks that require integrating sparse evidence across days in egocentric video.

A Model-Free Universal AI

cs.AI · 2026-02-26 · unverdicted · novelty 8.0

AIQI is the first model-free universal AI agent proven asymptotically ε-optimal in general RL by inducing over distributional Q-functions instead of policies or environments.

CRONOS: Benchmarking Counterfactual Physical Consistency in Video Models

cs.CV · 2026-05-22 · unverdicted · novelty 7.0

CRONOS benchmark shows recent open-source video generators fail to preserve physical consistency under controlled changes to viewpoint, scene, object category, and appearance.

MemGym: a Long-Horizon Memory Environment for LLM Agents

cs.CL · 2026-05-20 · unverdicted · novelty 7.0

MemGym unifies agent gyms into a memory benchmark with isolated scoring across tool-use, research, coding, and computer-use regimes plus a lightweight reward model for tractable coding evaluation.

Demo-JEPA: Joint-Embedding Predictive Architecture for One-shot Cross-Embodiment Imitation

cs.RO · 2026-05-20 · unverdicted · novelty 7.0

Demo-JEPA enables one-shot cross-embodiment imitation by mapping visual demonstrations to shared latent future trajectories that serve as subgoals for the target agent's own forward dynamics planning.

Baba in Wonderland: Online Self-Supervised Dynamics Discovery for Executable World Models

cs.AI · 2026-05-16 · unverdicted · novelty 7.0

Alice uses preservation conflicts from failed candidate updates to create class-stratified hypotheses and guide exploration, improving executable world-model learning under prior misalignment.

Learning POMDP World Models from Observations with Language-Model Priors

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

Pinductor leverages language-model priors to learn POMDP world models from limited trajectories, matching privileged-access methods in performance and exceeding tabular baselines in sample efficiency.

JEDI: Joint Embedding Diffusion World Model for Online Model-Based Reinforcement Learning

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

JEDI is the first online end-to-end latent diffusion world model that trains latents from denoising loss rather than reconstruction, achieving competitive Atari100k results with 43% less VRAM and over 3x faster sampling than pixel diffusion baselines.

Runtime Monitoring of Perception-Based Autonomous Systems via Embedding Temporal Logic

cs.LG · 2026-05-12 · unverdicted · novelty 7.0 · 2 refs

Embedding Temporal Logic (ETL) performs runtime monitoring directly in learned embedding spaces using distance-based predicates composed with temporal operators, supported by conformal calibration for reliable predicate evaluation.

Support-Safe Variational Hybrid Filtering for Contact-Mode and Sparse-Law Recovery

cs.RO · 2026-05-12 · unverdicted · novelty 7.0

VHYDRO is a support-safe variational hybrid filter that jointly recovers continuous latent states, discrete contact modes, and sparse port-Hamiltonian laws per regime while preventing loss of feasible transitions.

The Gordian Knot for VLMs: Diagrammatic Knot Reasoning as a Hard Benchmark

cs.AI · 2026-05-11 · unverdicted · novelty 7.0

KnotBench benchmark shows state-of-the-art VLMs perform near random on diagrammatic knot reasoning tasks and lack ability to simulate structural moves.

ACWM-Phys: Investigating Generalized Physical Interaction in Action-Conditioned Video World Models

cs.CV · 2026-05-09 · unverdicted · novelty 7.0 · 2 refs

ACWM-Phys is a controllable simulator benchmark with in- and out-of-distribution protocols for evaluating action-conditioned world models across rigid, kinematic, deformable, and particle dynamics.

SYNCR: A Cross-Video Reasoning Benchmark with Synthetic Grounding

cs.CV · 2026-05-08 · unverdicted · novelty 7.0

SYNCR benchmark shows leading MLLMs reach only 52.5% average accuracy on cross-video reasoning tasks against an 89.5% human baseline, with major weaknesses in physical and spatial reasoning.

Learning Visual Feature-Based World Models via Residual Latent Action

cs.CV · 2026-05-08 · unverdicted · novelty 7.0

RLA-WM predicts residual latent actions via flow matching to create visual feature world models that outperform prior feature-based and diffusion approaches while enabling offline video-based robot RL.

Operator-Guided Invariance Learning for Continuous Reinforcement Learning

cs.LG · 2026-05-07 · unverdicted · novelty 7.0

VPSD-RL discovers exact and approximate value-preserving Lie-group operators in continuous RL to stabilize learning via transition augmentation and consistency regularization.

Render, Don't Decode: Weight-Space World Models with Latent Structural Disentanglement

cs.CV · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

NOVA represents world states as INR weights for decoder-free rendering, compactness, and unsupervised disentanglement of background, foreground, and motion in video world models.

Counterfactual identifiability beyond global monotonicity: non-monotone triangular structural causal models

cs.LG · 2026-05-06 · unverdicted · novelty 7.0

Non-monotone triangular SCMs with mechanism-wise invertibility and context-independent inverse transport are equivalent to exogenous isomorphism and achieve complete counterfactual identifiability, with supporting experiments on synthetic data and MuJoCo tasks.

Latent State Design for World Models under Sufficiency Constraints

cs.AI · 2026-05-03 · unverdicted · novelty 7.0

World models succeed when their latent states are built to meet task-specific sufficiency constraints rather than preserving the maximum amount of information.

Graph World Models: Concepts, Taxonomy, and Future Directions

cs.AI · 2026-04-30 · unverdicted · novelty 7.0

The paper unifies emerging graph-based world models under a new paradigm and proposes a taxonomy organized by spatial, physical, and logical relational inductive biases.

Exploring Spatial Intelligence from a Generative Perspective

cs.CV · 2026-04-22 · unverdicted · novelty 7.0

Fine-tuning multimodal models on a new synthetic spatial benchmark improves generative spatial compliance on real and synthetic tasks and transfers to better spatial understanding.

GTASA: Ground Truth Annotations for Spatiotemporal Analysis, Evaluation and Training of Video Models

cs.CV · 2026-04-12 · unverdicted · novelty 7.0

GTASA supplies annotated multi-actor videos with exact 3D spatial and temporal ground truth that outperforms neural video generators in physical and semantic validity while enabling new probes of video encoders.

EgoTL: Egocentric Think-Aloud Chains for Long-Horizon Tasks

cs.CV · 2026-04-10 · unverdicted · novelty 7.0

EgoTL provides a new egocentric dataset with think-aloud chains and metric labels that benchmarks VLMs on long-horizon tasks and improves their planning, reasoning, and spatial grounding after finetuning.

MotionScape: A Large-Scale Real-World Highly Dynamic UAV Video Dataset for World Models

cs.CV · 2026-04-09 · unverdicted · novelty 7.0

MotionScape is a large-scale UAV video dataset with highly dynamic 6-DoF motions, geometric trajectories, and semantic annotations to train world models that better simulate complex 3D dynamics under large viewpoint changes.

citing papers explorer

Showing 50 of 141 citing papers.

TRAP: Tail-aware Ranking Attack for World-Model Planning cs.LG · 2026-05-03 · unverdicted · none · ref 17 · internal anchor
TRAP is a tail-aware ranking attack that plants a backdoor in world models so that a trigger causes the model to reorder a few critical imagined trajectories and redirect planning while preserving normal behavior on clean inputs.
Divide and Conquer: Decoupled Representation Alignment for Multimodal World Models cs.CV · 2026-05-03 · unverdicted · none · ref 11 · internal anchor
M²-REPA decouples modality-specific features inside a diffusion model and aligns each to its matching expert foundation model via an alignment loss plus a decoupling regularizer, yielding better visual quality and long-term consistency in multi-modal video generation.
RAY-TOLD: Ray-Based Latent Dynamics for Dense Dynamic Obstacle Avoidance with TDMPC cs.RO · 2026-04-30 · unverdicted · none · ref 15 · internal anchor
RAY-TOLD combines ray-based latent dynamics from LiDAR with MPPI control and a learned policy prior via mixture sampling to lower collision rates in high-density dynamic obstacle environments compared to standard MPPI.
Data-Driven Open-Loop Simulation for Digital-Twin Operator Decision Support in Wastewater Treatment cs.LG · 2026-04-22 · unverdicted · none · ref 26 · internal anchor
CCSS-RS achieves RMSE 0.696 and CRPS 0.349 at 1000-step horizons on a large public WWTP benchmark with 43% missingness, outperforming Neural CDE baselines by 40-46% in RMSE.
Xiaomi OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation cs.CV · 2026-04-20 · unverdicted · none · ref 38 · 2 links · internal anchor
OneVL achieves superior accuracy to explicit chain-of-thought reasoning at answer-only latency by supervising latent tokens with a visual world model decoder that predicts future frames.
Human Cognition in Machines: A Unified Perspective of World Models cs.RO · 2026-04-17 · unverdicted · none · ref 60 · internal anchor
The paper introduces a unified framework for world models that fully incorporates all cognitive functions from Cognitive Architecture Theory, highlights under-researched areas in motivation and meta-cognition, and proposes Epistemic World Models as a new category for scientific discovery agents.
Learning Ad Hoc Network Dynamics via Graph-Structured World Models cs.LG · 2026-04-16 · unverdicted · none · ref 2 · internal anchor
G-RSSM learns per-node dynamics in wireless ad hoc networks via graph attention and trains clustering policies through imagined rollouts, generalizing from N=50 training to larger networks.
Robotic Manipulation is Vision-to-Geometry Mapping ($f(v) \rightarrow G$): Vision-Geometry Backbones over Language and Video Models cs.RO · 2026-04-14 · unverdicted · none · ref 17 · internal anchor
Vision-geometry backbones using pretrained 3D world models outperform vision-language and video models for robotic manipulation by enabling direct mapping from visual input to geometric actions.
LMGenDrive: Bridging Multimodal Understanding and Generative World Modeling for End-to-End Driving cs.CV · 2026-04-09 · unverdicted · none · ref 17 · internal anchor
LMGenDrive unifies LLM-based multimodal understanding with generative world models to output both future driving videos and control signals for end-to-end closed-loop autonomous driving.
GIRL: Generative Imagination Reinforcement Learning via Information-Theoretic Hallucination Control cs.LG · 2026-04-08 · unverdicted · none · ref 2 · internal anchor
GIRL reduces latent rollout drift by 38-61% versus DreamerV3 in MBRL by grounding transitions with DINOv2 embeddings and using an information-theoretic adaptive bottleneck, yielding better long-horizon returns on control benchmarks.
Veo-Act: How Far Can Frontier Video Models Advance Generalizable Robot Manipulation? cs.RO · 2026-04-06 · unverdicted · none · ref 18 · internal anchor
Veo-3 video predictions enable approximate task-level robot trajectories in zero-shot settings but require hierarchical integration with low-level VLA policies for reliable manipulation performance.
Behavior-Constrained Reinforcement Learning with Receding-Horizon Credit Assignment for High-Performance Control cs.RO · 2026-04-03 · unverdicted · none · ref 12 · internal anchor
A behavior-constrained RL framework with receding-horizon credit assignment learns high-performance control policies that stay aligned with expert behavior in race car simulation.
Safety, Security, and Cognitive Risks in World Models cs.CR · 2026-04-01 · unverdicted · none · ref 1 · internal anchor
World models enable efficient AI planning but create risks from adversarial corruption, goal misgeneralization, and human bias, demonstrated via attacks that amplify errors and reduce rewards on models like RSSM and DreamerV3.
Metriplector: From Field Theory to Neural Architecture cs.AI · 2026-03-31 · unverdicted · none · ref 9 · internal anchor
Metriplector treats neural computation as coupled metriplectic field dynamics whose stress-energy tensor readout achieves competitive results on vision, control, Sudoku, language modeling, and pathfinding with small parameter counts.
Video Generation Models as World Models: Efficient Paradigms, Architectures and Algorithms eess.IV · 2026-03-30 · unverdicted · none · ref 16 · internal anchor
Video generation models can function as world simulators if efficiency gaps in spatiotemporal modeling are bridged via organized paradigms, architectures, and algorithms.
Dreamer-CDP: Improving Reconstruction-free World Models Via Continuous Deterministic Representation Prediction cs.LG · 2026-03-07 · unverdicted · none · ref 6 · internal anchor
Dreamer-CDP achieves reconstruction-free world modeling via a JEPA-style predictor on continuous deterministic representations and matches Dreamer's performance on Crafter.
HAIC: Humanoid Agile Object Interaction Control via Dynamics-Aware World Model cs.RO · 2026-02-12 · unverdicted · none · ref 10 · internal anchor
HAIC enables robust humanoid interactions with underactuated objects by predicting their dynamics from proprioceptive history and using a world model for adaptive control.
World model inspired sarcasm reasoning with large language model agents cs.CL · 2025-12-30 · unverdicted · none · ref 8 · internal anchor
WM-SAR decomposes sarcasm into LLM-agent components, quantifies literal-normative inconsistency deterministically, and integrates it with intention via logistic regression to outperform prior sarcasm detectors on benchmarks.
Learning to Feel the Future: DreamTacVLA for Contact-Rich Manipulation cs.RO · 2025-12-29 · unverdicted · none · ref 9 · internal anchor
DreamTacVLA grounds VLA models in contact physics by aligning multi-scale vision-tactile inputs and predicting future tactile states, reaching up to 95% success on contact-rich tasks.
Cambrian-S: Towards Spatial Supersensing in Video cs.CV · 2025-11-06 · unverdicted · none · ref 49 · internal anchor
Cambrian-S introduces VSI-SUPER benchmarks for long-horizon spatial recall and counting, shows data scaling yields 30% gains on existing tests, and demonstrates a self-supervised next-latent predictor using surprise outperforms baselines on the new spatial supersensing tasks.
Co-Evolving Latent Action World Models cs.LG · 2025-10-30 · unverdicted · none · ref 14 · internal anchor
CoLA-World jointly trains latent action models and world models with a warm-up phase to achieve co-evolution, matching or exceeding prior two-stage methods in video simulation quality and visual planning performance.
Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation cs.RO · 2025-08-07 · unverdicted · none · ref 13 · internal anchor
Genie Envisioner unifies robotic policy learning, simulation, and evaluation inside one instruction-conditioned video diffusion framework using GE-Base, GE-Act, and GE-Sim.
Vidar: Embodied Video Diffusion Model for Generalist Manipulation cs.LG · 2025-07-17 · unverdicted · none · ref 38 · internal anchor
Vidar shows that a video diffusion prior continuously pre-trained on 750K multi-view robot trajectories plus a label-free masked inverse dynamics adapter can generalize manipulation to new robot embodiments with 1% of typical demonstration data.
Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling cs.CV · 2025-07-10 · unverdicted · none · ref 28 · internal anchor
Geometry Forcing aligns video diffusion representations with geometric foundation model features via angular cosine and scale regression objectives to improve 3D consistency in generated videos.
V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning cs.AI · 2025-06-11 · unverdicted · none · ref 27 · internal anchor
V-JEPA 2 pre-trained on massive unlabeled video achieves strong results on motion understanding and action anticipation, SOTA video QA at 8B scale, and enables zero-shot robotic planning on Franka arms using only 62 hours of unlabeled robot video.
Physically Interpretable World Models via Weakly Supervised Representation Learning cs.LG · 2024-12-17 · unverdicted · none · ref 15 · internal anchor
PIWM aligns latent states in image-based world models with physical variables and constrains their dynamics to known equations via weak distribution supervision, yielding accurate long-horizon predictions and parameter recovery on Cart Pole, Lunar Lander, and Donkey Car.
GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation cs.RO · 2024-10-08 · unverdicted · none · ref 64 · internal anchor
GR-2 pre-trains on web-scale videos then fine-tunes on robot data to reach 97.7% average success across over 100 manipulation tasks with strong generalization to new scenes and objects.
Training Language Models to Self-Correct via Reinforcement Learning cs.LG · 2024-09-19 · unverdicted · none · ref 240 · internal anchor
SCoRe uses multi-turn online RL with regularization on self-generated traces to improve LLM self-correction, achieving 15.6% and 9.1% gains on MATH and HumanEval for Gemini models.
Reasoning with Language Model is Planning with World Model cs.CL · 2023-05-24 · unverdicted · none · ref 74 · internal anchor
RAP turns LLMs into dual world-model and planning agents via MCTS to generate better reasoning paths, outperforming CoT baselines and achieving 33% relative gains over GPT-4 CoT using LLaMA-33B on plan generation.
Learning World Graphs to Accelerate Hierarchical Reinforcement Learning cs.LG · 2019-07-01 · unverdicted · none · ref 37 · internal anchor
A two-stage framework learns a world graph of pivotal states task-agnostically via joint training of a latent model and curiosity-driven policy, then uses the graph to accelerate hierarchical RL on maze tasks.
Emergence of Exploratory Look-Around Behaviors through Active Observation Completion cs.CV · 2019-06-27 · unverdicted · none · ref 45 · internal anchor
An RL agent learns to actively explore by being rewarded for inferring unobserved scene parts after short glimpse sequences, with sidekick policy learning enabling generalization to other active perception tasks.
LVDrive: Latent Visual Representation Enhanced Vision-Language-Action Autonomous Driving Model cs.CV · 2026-05-21 · unverdicted · none · ref 14 · internal anchor
LVDrive improves closed-loop driving on Bench2Drive by adding latent future scene prediction to VLA models via unified embedding space processing and two-stage trajectory decoding.
ChronoMedicalWorld: A Medical World Model for Learning Patient Trajectories from Longitudinal Care Data cs.LG · 2026-05-21 · unverdicted · none · ref 9 · internal anchor
CMWM is a recurrent latent world model for forecasting patient trajectories like annual eGFR in CKD, reporting 7.28% lower MAE than a tuned GPT-5.5 baseline on a 2232-patient cohort with gains from dialogue data.
stable-worldmodel: A Platform for Reproducible World Modeling Research and Evaluation cs.LG · 2026-05-20 · unverdicted · none · ref 5 · internal anchor
The paper presents stable-worldmodel (swm), a platform with high-performance data layer, modern world model baselines, planning solvers, and extended environments for reproducible research and generalization evaluation.
ECG-WM: A Physiology-Informed ECG World Model for Clinical Intervention Simulation cs.AI · 2026-05-17 · unverdicted · none · ref 6 · internal anchor
ECG-WM combines ODE physiological priors with latent diffusion models to generate intervention-conditioned ECG trajectories and uses diffusion stochasticity for uncertainty-aware clinical risk assessment.
SWoMo: Neuro-Symbolic World Model for Cataract Surgery Simulation cs.CV · 2026-05-15 · conditional · none · ref 11 · internal anchor
SWoMo decouples symbolic rule-based motion modeling via scene graphs from visual realism via diffusion models, trained through inverse pairing of real cataract surgery videos reconstructed in the simulator for sim-to-real translation.
SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer cs.CV · 2026-05-14 · unverdicted · none · ref 1 · internal anchor
SANA-WM is a 2.6B-parameter efficient world model that synthesizes minute-scale 720p videos with 6-DoF camera control, trained on 213K public clips in 15 days on 64 H100s and runnable on single GPUs at 36x higher throughput than prior open baselines.
PROWL: Prioritized Regret-Driven Optimization for World Model Learning cs.LG · 2026-05-11 · unverdicted · none · ref 2 · internal anchor
PROWL introduces a KL-constrained adversarial curriculum and prioritized adversarial trajectory buffer to actively discover and correct rare failure modes in action-conditioned video world models.
Probing the Impact of Scale on Data-Efficient, Generalist Transformer World Models for Atari cs.LG · 2026-05-09 · unverdicted · none · ref 29 · internal anchor
Transformer world models on Atari exhibit game-specific scaling regimes, but joint training on 26 environments produces consistent monotonic gains that improve downstream control policies to a median normalized score of 0.770.
Reconstruction or Semantics? What Makes a Latent Space Useful for Robotic World Models cs.CV · 2026-05-07 · unverdicted · none · ref 20 · internal anchor
Semantic latent spaces from pretrained encoders outperform reconstruction-based spaces for robotic world models on planning and downstream policy performance.
CKT-WAM: Parameter-Efficient Context Knowledge Transfer Between World Action Models cs.RO · 2026-05-07 · unverdicted · none · ref 1 · internal anchor
CKT-WAM transfers teacher WAM knowledge to students via compressed text-embedding contexts using LQCA and adapters, reaching 86.1% success on LIBERO-Plus with 1.17% trainable parameters and 83.3% in real-world tasks.
FAAST: Forward-Only Associative Learning via Closed-Form Fast Weights for Test-Time Supervised Adaptation cs.LG · 2026-05-06 · unverdicted · none · ref 7 · 2 links · internal anchor
FAAST performs test-time supervised adaptation by analytically deriving fast weights from examples in one forward pass, matching backprop performance with over 90% less adaptation time and up to 95% memory savings versus memory-based methods.
HDFlow: Hierarchical Diffusion-Flow Planning for Long-horizon Tasks cs.RO · 2026-05-06 · unverdicted · none · ref 2 · 2 links · internal anchor
HDFlow pairs a high-level diffusion planner for strategic subgoals with a low-level rectified flow planner for efficient trajectories, claiming superior performance on furniture assembly and other long-horizon robotic benchmarks.
SPLICE: Latent Diffusion over JEPA Embeddings for Conformal Time-Series Inpainting cs.LG · 2026-04-30 · unverdicted · none · ref 13 · internal anchor
SPLICE couples JEPA-based latent diffusion with adaptive conformal inference to deliver accurate time-series inpainting with 93-95% empirical coverage on load datasets.
Toward a Science of Intent: Closure Gaps and Delegation Envelopes for Open-World AI Agents cs.AI · 2026-04-27 · unverdicted · none · ref 6 · internal anchor
Intent compilation turns vague human goals into verifiable artifacts, using closure-gap vectors and delegation envelopes to separate open-world agent challenges from closed-world solvers and to benchmark closure fixes against extra search.
Cortex 2.0: Grounding World Models in Real-World Industrial Deployment cs.RO · 2026-04-22 · unverdicted · none · ref 24 · internal anchor
Cortex 2.0 introduces world-model-based planning that generates and scores future trajectories to outperform reactive vision-language-action baselines on industrial robotic tasks including pick-and-place, sorting, and unpacking.
CausalVAE as a Plug-in for World Models: Towards Reliable Counterfactual Dynamics cs.LG · 2026-04-09 · unverdicted · none · ref 5 · internal anchor
CausalVAE plug-in for world models preserves factual prediction and boosts counterfactual retrieval, with large gains on physics benchmarks and recovered physical interaction trends.
Neural Computers cs.LG · 2026-04-07 · unverdicted · none · ref 12 · internal anchor
Neural Computers are introduced as a new machine form where computation, memory, and I/O are unified in a learned runtime state, with initial video-model experiments showing acquisition of basic interface primitives from traces.
Designing Digital Humans with Ambient Intelligence cs.HC · 2026-04-06 · unverdicted · none · ref 78 · internal anchor
Integrating ambient intelligence with digital humans creates context-aware virtual agents capable of anticipatory assistance based on the user's surroundings.
A Model of Understanding in Deep Learning Systems cs.AI · 2026-04-05 · unverdicted · none · ref 3 · internal anchor
Deep learning systems achieve systematic understanding through internal models tracking regularities but exhibit fractured understanding due to symbolic misalignment, lack of explicit reduction, and weak unification.
