Infinibench: Infinite benchmarking for visual spatial reasoning with customizable scene complexity
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
citing papers explorer
- Uncovering and Shaping the Latent Representation of 3D Scene Topology in Vision-Language Models
  VLMs possess a latent 3D scene topology subspace corresponding to Laplacian eigenmaps. This subspace can be causally shaped via Dirichlet energy regularization, improving spatial task performance by up to 12.1%.
- ProcFunc: Function-Oriented Abstractions for Procedural 3D Generation in Python
  ProcFunc introduces a Python library with function-oriented abstractions for procedural 3D generation in Blender. The library enables combinatorial scene creation and is demonstrated with a new indoor room generator that uses compositional materials to produce synthetic data.
- LychSim: A Controllable and Interactive Simulation Framework for Vision Research
  LychSim introduces a controllable simulation platform built on Unreal Engine 5, with a Python API, procedural generation, and LLM integration for vision research tasks.