Lift, splat, shoot: Encoding images from arbitrary camera rigs by implicitly unprojecting to 3D
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (2)
verdicts: UNVERDICTED (2)
citing papers explorer
- Monocular Open Vocabulary Occupancy Prediction for Indoor Scenes
  A 3D Language-Embedded Gaussians framework with opacity-aware Poisson volumetric aggregation and progressive temperature decay achieves 59.50 IoU and 21.05 mIoU on Occ-ScanNet for open-vocabulary indoor occupancy.
- Online 3D Multi-Camera Perception through Robust 2D Tracking and Depth-based Late Aggregation
  Extends online 2D multi-camera tracking to 3D via depth-based point cloud reconstruction, clustering for 3D boxes, and local ID consistency for global data association, placing 3rd on the 2025 AI City Challenge 3D MTMC dataset.