ReconPhys is the first feedforward neural network that jointly reconstructs 3D geometry and appearance via Gaussian Splatting while estimating physical attributes from a single monocular video using self-supervised training.
Embodiedreamer: Advancing real2sim2real transfer for policy training via embodied world modeling.arXiv preprint arXiv:2507.05198
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 4verdicts
UNVERDICTED 4roles
background 1polarities
background 1representative citing papers
A video transfer pipeline augments simulated VLA data into realistic videos while preserving actions, yielding consistent performance gains on robot benchmarks such as 8% on Robotwin 2.0.
ECG-WM combines ODE physiological priors with latent diffusion models to generate intervention-conditioned ECG trajectories and uses diffusion stochasticity for uncertainty-aware clinical risk assessment.
SFI-Bench shows current multimodal LLMs struggle to integrate spatial memory with functional reasoning and external knowledge in video tasks.
citing papers explorer
-
ReconPhys: Reconstruct Appearance and Physical Attributes from Single Video
ReconPhys is the first feedforward neural network that jointly reconstructs 3D geometry and appearance via Gaussian Splatting while estimating physical attributes from a single monocular video using self-supervised training.
-
Seeing Realism from Simulation: Efficient Video Transfer for Vision-Language-Action Data Augmentation
A video transfer pipeline augments simulated VLA data into realistic videos while preserving actions, yielding consistent performance gains on robot benchmarks such as 8% on Robotwin 2.0.
-
ECG-WM: A Physiology-Informed ECG World Model for Clinical Intervention Simulation
ECG-WM combines ODE physiological priors with latent diffusion models to generate intervention-conditioned ECG trajectories and uses diffusion stochasticity for uncertainty-aware clinical risk assessment.
-
From Where Things Are to What They Are For: Benchmarking Spatial-Functional Intelligence in Multimodal LLMs
SFI-Bench shows current multimodal LLMs struggle to integrate spatial memory with functional reasoning and external knowledge in video tasks.