Planning with reasoning using vision language world model

9 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

years

2026: 7 · 2025: 2

verdicts

UNVERDICTED 9

polarities

background 2

representative citing papers

RECIPE: Procedural Planning via Grounding in Instructional Video

cs.CV · 2026-05-19 · unverdicted · novelty 7.0

RECIPE improves visual procedural planners by rewarding plans according to their grounding quality in ASR transcripts via GRPO, yielding +7–8 in-domain and up to +16 zero-shot macro-accuracy gains over base models and outperforming supervised fine-tuning on seven benchmarks.

GeoWorld: Geometric World Models

cs.CV · 2026-02-26 · unverdicted · novelty 6.0

GeoWorld applies hyperbolic geometry to JEPA world models and introduces geometric reinforcement learning, reporting modest success-rate gains of ~3% and ~2% on 3- and 4-step planning tasks versus V-JEPA 2.

World Simulation with Video Foundation Models for Physical AI

cs.CV · 2025-10-28 · unverdicted · novelty 4.0

Cosmos-Predict2.5 unifies text-to-world, image-to-world, and video-to-world generation in one model trained on 200M clips with RL post-training, delivering improved quality and control for physical AI.
