3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding

10 Pith papers cite this work. Polarity classification is still indexing.

fields

cs.CV 9 · cs.CL 1

years

2026 9 · 2025 1

roles

background 3

polarities

background 3

representative citing papers

Token Warping Helps MLLMs Look from Nearby Viewpoints

cs.CV · 2026-04-03 · unverdicted · novelty 7.0

Backward token warping in ViT-based MLLMs enables reliable reasoning from nearby viewpoints by preserving semantic coherence better than pixel-wise warping or fine-tuning baselines.

Distilling Neuro-Symbolic Programs into 3D Multi-modal LLMs

cs.CV · 2026-05-31 · unverdicted · novelty 6.0 · 2 refs

APEIRIA distills neuro-symbolic 3D reasoning programs into 3D MLLMs through a curriculum that transfers stepwise verification patterns to achieve transparent yet flexible spatial reasoning.

GeoWorld: Geometric World Models

cs.CV · 2026-02-26 · unverdicted · novelty 6.0

GeoWorld applies hyperbolic geometry to JEPA world models and introduces geometric reinforcement learning, reporting modest success-rate gains of ~3% and ~2% on 3- and 4-step planning tasks versus V-JEPA 2.

Grounded 3D-Aware Spatial Vision-Language Modeling

cs.CV · 2026-05-28 · unverdicted · novelty 5.0

GR3D is a VLM that combines explicit 2D, implicit 2D, and monocular 3D grounding mechanisms to improve performance on spatial understanding benchmarks.

UniMesh: Unifying 3D Mesh Understanding and Generation

cs.CV · 2026-04-19 · unverdicted · novelty 5.0

UniMesh unifies 3D mesh generation and understanding in one model via a Mesh Head interface, Chain of Mesh iterative editing, and an Actor-Evaluator self-reflection loop.
