ViSRA boosts MLLM 3D spatial reasoning performance by up to 28.9% on unseen tasks via a plug-and-play video-based agent that extracts explicit spatial cues from expert models without any post-training.
Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6):381–395
6 Pith papers cite this work. Polarity classification is still indexing.
Citation summary
Years: 2026 (6)
Roles: method (3)
Polarities: use method (3)
Representative citing papers
Sparkle supplies a large-scale dataset and benchmark for instruction-driven video background replacement, enabling models that generate more natural and temporally consistent new scenes than earlier approaches.
SceneAligner grounds floorplan localization in a reconstructed 3D scene by creating a density-map proxy and learning to align it with rasterized floorplans via a fine-tuned 2D foundation model.
SplitGS-Loc disambiguates 2D-3D correspondences in photometrically optimized GSFFs via Mixture-of-Gaussians splitting and multi-view consistency selection, yielding stable PnP and SOTA localization results.
Projecting 3D LiDAR to BEV images and applying YOLO-OBB with spatiotemporal fusion enables reliable real-time structural detection on resource-constrained robots.