Mixed citations

Vid2World: Crafting video diffusion models to interactive world models. arXiv preprint arXiv:2505.14357, 2025

Mixed citation behavior. Most common role is background (67%).

13 Pith papers citing it
Background 67% of classified citations

citation-role summary

background: 5 · baseline: 1

years

2026: 12 · 2025: 1

representative citing papers

DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos

cs.RO · 2026-02-06 · unverdicted · novelty 7.0

DreamDojo is a foundation world model, pretrained on the largest human video dataset to date, that uses continuous latent actions to transfer interaction knowledge and achieves controllable physics simulation after robot post-training.

PanoWorld: Geometry-Consistent Panoramic Video World Modeling

cs.CV · 2026-05-14 · unverdicted · novelty 6.0

PanoWorld adds depth-consistency and trajectory-consistency losses and spherical adaptations to a pre-trained video model, and introduces a new PanoGeo dataset, to produce geometry-consistent 360-degree video.

Diffusion Model as a Generalist Segmentation Learner

cs.CV · 2026-04-27 · unverdicted · novelty 6.0

DiGSeg repurposes diffusion U-Nets as generalist segmentation learners by conditioning on image-mask latents and multi-scale CLIP text features, achieving strong cross-domain performance.

Co-Evolving Latent Action World Models

cs.LG · 2025-10-30 · unverdicted · novelty 6.0

CoLA-World jointly trains latent action models and world models with a warm-up phase to achieve co-evolution, matching or exceeding prior two-stage methods in video simulation quality and visual planning performance.

WorldString: Actionable World Representation

cs.AI · 2026-05-18 · unverdicted · novelty 4.0 · 2 refs

Proposes WorldString, a differentiable neural model for the state manifold of actionable physical objects learned directly from 3D or video data as a building block for world models.
