A multimodal pipeline decodes EEG into 3D meshes via EEG-to-image, MLLM reasoning, diffusion, and single-image-to-3D conversion, reporting 85.4% 10-way accuracy and 0.648 CLIPScore.
In: European Conference on Computer Vision
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (3)
years: 2026 (3)
roles: background (1)
polarities: background (1)
representative citing papers
LSRM scales transformer context windows with native sparse attention and geometric routing to deliver high-fidelity feed-forward 3D reconstruction and inverse rendering that approaches dense optimization quality.
PAD synthesizes 3D geometry in observation space, using depth unprojection as an anchor to eliminate pose ambiguity in image-to-3D generation.
citing papers explorer
- Brain3D: EEG-to-3D Decoding of Visual Representations via Multimodal Reasoning
  A multimodal pipeline decodes EEG into 3D meshes via EEG-to-image, MLLM reasoning, diffusion, and single-image-to-3D conversion, reporting 85.4% 10-way accuracy and 0.648 CLIPScore.
- LSRM: High-Fidelity Object-Centric Reconstruction via Scaled Context Windows
  LSRM scales transformer context windows with native sparse attention and geometric routing to deliver high-fidelity feed-forward 3D reconstruction and inverse rendering that approaches dense optimization quality.
- Pose-Aware Diffusion for 3D Generation
  PAD synthesizes 3D geometry in observation space, using depth unprojection as an anchor to eliminate pose ambiguity in image-to-3D generation.