ShapeNet: An Information-Rich 3D Model Repository
50 Pith papers cite this work.
abstract
We present ShapeNet: a richly-annotated, large-scale repository of shapes represented by 3D CAD models of objects. ShapeNet contains 3D models from a multitude of semantic categories and organizes them under the WordNet taxonomy. It is a collection of datasets providing many semantic annotations for each 3D model such as consistent rigid alignments, parts and bilateral symmetry planes, physical sizes, keywords, as well as other planned annotations. Annotations are made available through a public web-based interface to enable data visualization of object attributes, promote data-driven geometric analysis, and provide a large-scale quantitative benchmark for research in computer graphics and vision. At the time of this technical report, ShapeNet has indexed more than 3,000,000 models, 220,000 models out of which are classified into 3,135 categories (WordNet synsets). In this report we describe the ShapeNet effort as a whole, provide details for all currently available datasets, and summarize future plans.
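The abstract describes ShapeNet's core data organization: models grouped under WordNet synsets, each carrying per-model annotations such as alignments and physical sizes. A minimal sketch of that indexing scheme, in plain Python, is shown below; the synset offsets, model IDs, and annotation fields are illustrative placeholders, not ShapeNet's actual storage format.

```python
# Illustrative sketch: indexing CAD models under WordNet-style synset IDs,
# mirroring how ShapeNet groups models by category. Model IDs and
# annotation fields here are hypothetical placeholders.
from collections import defaultdict

class ModelIndex:
    def __init__(self):
        self._by_synset = defaultdict(list)

    def add(self, synset_id, model_id, annotations=None):
        """Register a model under a WordNet synset with optional annotations."""
        self._by_synset[synset_id].append({
            "model_id": model_id,
            "annotations": annotations or {},
        })

    def models(self, synset_id):
        """List the model IDs filed under one synset (category)."""
        return [m["model_id"] for m in self._by_synset[synset_id]]

    def num_categories(self):
        return len(self._by_synset)

index = ModelIndex()
index.add("02691156", "model_0001", {"upright": [0, 0, 1], "size_m": 11.4})
index.add("02691156", "model_0002")
index.add("03001627", "model_0003")

print(index.models("02691156"))  # ['model_0001', 'model_0002']
print(index.num_categories())    # 2
```

Keying everything on synset IDs is what lets category-level queries, alignments, and benchmark splits stay consistent with the WordNet taxonomy.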
citing papers explorer
-
Towards Realistic 3D Emission Materials: Dataset, Baseline, and Evaluation for Emission Texture Generation
The work creates the first dataset and baseline for generating emission textures on 3D objects to reproduce glowing materials from input images.
-
Min Generalized Sliced Gromov Wasserstein: A Scalable Path to Gromov Wasserstein
min-GSGW learns coupled nonlinear slicers to produce a rigid-motion-invariant, scalable approximation to the Gromov-Wasserstein distance and its transport plans.
-
Img2CADSeq: Image-to-CAD Generation via Sequence-Based Diffusion
Img2CADSeq generates standard CAD sequences from images via a multi-stage pipeline with three-level hierarchical codebook encoding, importance-guided compression, and contrastive point-cloud conditioning of a VQ-Diffusion model, outperforming prior methods on new CAD-220K and PrintCAD datasets.
-
Count Anything at Any Granularity
Multi-grained counting is introduced with five granularity levels, supported by the new KubriCount dataset generated via 3D synthesis and editing, and by the HieraCount model, which combines text and visual exemplars for improved accuracy.
-
The Wittgensteinian Representation Hypothesis: Is Language the Attractor of Multimodal Convergence?
Language representations serve as the asymptotic attractor for convergence in independently trained multimodal neural networks due to feature density asymmetry.
-
MeshFIM: Local Low-Poly Mesh Editing via Fill-in-the-Middle Autoregressive Generation
MeshFIM enables local low-poly mesh editing by autoregressively filling target regions conditioned on context, using boundary markers, positional embeddings, and a gated geometry encoder to enforce attachment, topology, and region limits.
-
Rollback-Free Stable Brick Structures Generation
Reinforcement learning internalizes physical stability rules for brick structures, enabling the first rollback-free generation with orders-of-magnitude faster inference.
-
Two Steps Are All You Need: Efficient 3D Point Cloud Anomaly Detection with Consistency Models
Consistency learning reformulates 3D point cloud anomaly detection to predict clean geometry directly in one or two steps, yielding up to 80 times faster inference while matching state-of-the-art accuracy.
-
ADS: Random Sampling of Occupancy Functions using Adaptive Delaunay Scaffolding
ADS adaptively refines a Delaunay scaffold to produce unbiased random samples on occupancy function surfaces together with a connecting mesh, using far fewer evaluations than existing approaches.
-
Generative Modeling with Orbit-Space Particle Flow Matching
OGPP is a particle flow-matching method using orbit-space canonicalization and geometric paths that achieves lower error and fewer steps than prior approaches on 3D benchmarks.
-
AirZoo: A Unified Large-Scale Dataset for Grounding Aerial Geometric 3D Vision
AirZoo is a new large-scale synthetic dataset for aerial 3D vision that improves state-of-the-art models on image retrieval, cross-view matching, and 3D reconstruction when used for fine-tuning.
-
Topo-ADV: Generating Topology-Driven Imperceptible Adversarial Point Clouds
Topo-ADV uses differentiable persistent homology to create topology-altering perturbations that achieve up to 100% attack success on point cloud classifiers like PointNet while remaining geometrically imperceptible.
-
Training-free Spatially Grounded Geometric Shape Encoding (Technical Report)
XShapeEnc encodes arbitrary 2D spatially grounded shapes into compact invertible representations by decomposing them into unit-disk geometry and harmonic pose fields then applying Zernike bases with frequency propagation.
-
3D-Fixer: Coarse-to-Fine In-place Completion for 3D Scenes from a Single Image
3D-Fixer performs in-place 3D asset completion from single-view partial point clouds via coarse-to-fine generation with ORFA conditioning, plus a new ARSG-110K dataset, to achieve higher geometric accuracy than MIDI and Gen3DSR while keeping diffusion efficiency.
-
Deformation-based In-Context Learning for Point Cloud Understanding
DeformPIC deforms query point clouds under prompt guidance for in-context learning, outperforming prior methods with lower Chamfer Distance on reconstruction, denoising, and registration tasks.
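The DeformPIC entry reports improvements in Chamfer Distance, the standard point-cloud similarity metric. A minimal numpy sketch of the symmetric squared variant is below; note that papers differ on conventions (squared vs. unsquared distances, sum vs. mean), so this is one common form, not necessarily the exact one used in that work.

```python
# Minimal sketch of the symmetric squared Chamfer Distance between two
# point clouds: mean nearest-neighbor squared distance in both directions.
import numpy as np

def chamfer_distance(p, q):
    """p: (n, 3) array, q: (m, 3) array; returns a scalar."""
    # Pairwise squared distances via broadcasting, shape (n, m).
    d2 = np.sum((p[:, None, :] - q[None, :, :]) ** 2, axis=-1)
    # p -> q direction plus q -> p direction.
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
b = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
print(chamfer_distance(a, b))  # 0.0 for identical clouds
```

The brute-force (n, m) distance matrix is fine for small clouds; real evaluation code typically uses a KD-tree or GPU kernels instead.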
-
Fast Graph Representation Learning with PyTorch Geometric
PyTorch Geometric is a PyTorch library that delivers fast graph neural network training through sparse GPU kernels and variable-size mini-batching.
-
ObjView-Bench: Rethinking Difficulty and Deployment for Object-Centric View Planning
ObjView-Bench disentangles omnidirectional self-occlusion, saturation difficulty, and set-cover planning difficulty, then shows that budget regimes and reachable-view constraints change planner rankings and failure modes across classical, learned, and hybrid methods.
-
GenMed: A Pairwise Generative Reformulation of Medical Diagnostic Tasks
GenMed uses diffusion models to capture P(X,Y) for medical tasks and performs inference via gradient-based test-time optimization, supporting arbitrary observation combinations without retraining.
-
Beyond Spatial Compression: Interface-Centric Generative States for Open-World 3D Structure
C2LT-3D factorizes 3D tokenization into canonical local geometry, partition-conditioned context, and relational seam variables to make latent states operational for assembly-level validation and repair in open-world multi-component assets.
-
Minimax Optimal Estimation of Transport-Growth Pairs in Unbalanced Optimal Transport
Estimators for transport-growth pairs in unbalanced OT achieve minimax optimal rates, supported by a value-based stability reduction through a UOT gap condition.
-
Toward Visually Realistic Simulation: A Benchmark for Evaluating Robot Manipulation in Simulation
VISER is a new visually realistic simulation benchmark for robot manipulation tasks that uses PBR materials and MLLM-assisted asset generation, achieving 0.92 Pearson correlation with real-world policy performance.
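VISER's headline number is a 0.92 Pearson correlation between simulated and real-world policy performance. For reference, the Pearson coefficient can be computed as below; the score arrays are made-up illustrative data, not VISER's results.

```python
# Minimal sketch of the Pearson correlation coefficient between two
# score series (e.g., simulated vs. real policy performance).
import numpy as np

def pearson_r(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    # Covariance over the product of standard deviations.
    return (xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc))

sim_scores  = [0.2, 0.5, 0.7, 0.9]   # illustrative only
real_scores = [0.1, 0.4, 0.8, 0.85]  # illustrative only
print(pearson_r(sim_scores, real_scores))
```

A value near 1.0 means simulated rankings track real-world rankings almost linearly, which is what makes a simulation benchmark predictive.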
-
Prop-Chromeleon: Adaptive Haptic Props in Mixed Reality through Generative Artificial Intelligence
A generative-AI pipeline dynamically generates and anchors virtual assets to match the shape of physical props, enabling adaptive passive haptics in MR that users rate higher in realism, immersion, and enjoyment than static baselines.
-
TAFA-GSGC: Group-wise Scalable Point Cloud Geometry Compression with Progressive Residual Refinement
TAFA-GSGC is a scalable point cloud geometry compression codec using progressive residual refinement and group-wise entropy coding that achieves average BD-rate reductions of 4.99% (D1-PSNR) and 5.92% (D2-PSNR) over PCGCv2 while supporting monotonic multi-quality decoding from a single bitstream.
-
ShapeY: A Principled Framework for Measuring Shape Recognition Capacity via Nearest-Neighbor Matching
ShapeY is a benchmark dataset and nearest-neighbor protocol that measures shape-based recognition in vision models, revealing that even state-of-the-art networks fail to generalize consistently across 3D viewpoints and non-shape appearance changes.
-
Point-MF: One-step Point Cloud Generation from a Single Image via Mean Flows
Point-MF performs one-step point cloud reconstruction from single images by learning a mean velocity field in point space with a tailored Diffusion Transformer and a new auxiliary loss.
-
Text-Guided Multimodal Unified Industrial Anomaly Detection
A text-semantics-guided multimodal framework with geometry-aware mapping and object-conditioned text adaptation achieves state-of-the-art unsupervised anomaly detection and localization on RGB-3D industrial datasets while enabling a single model for multiple classes.
-
FILTR: Extracting Topological Features from Pretrained 3D Models
FILTR predicts persistence diagrams from pretrained 3D encoders on the new DONUT benchmark, showing that the encoders carry only limited topological signal but that a learnable feed-forward model approximates the diagrams successfully.
-
FurnSet: Exploiting Repeats for 3D Scene Reconstruction
FurnSet improves single-view 3D scene reconstruction by using per-object CLS tokens and set-aware self-attention to group and jointly reconstruct repeated object instances, with added scene-object conditioning and layout optimization.
-
Volume Transformer: Revisiting Vanilla Transformers for 3D Scene Understanding
A minimally modified vanilla Transformer called Volt achieves state-of-the-art 3D semantic and instance segmentation by using volumetric tokens, 3D rotary embeddings, and a data-efficient training recipe that scales better than domain-specific backbones.
-
One-Shot Cross-Geometry Skill Transfer through Part Decomposition
Part decomposition with generative shape models allows one-shot robot skill transfer across unfamiliar object geometries in simulation and real settings.
-
Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective
The paper proposes a problem-driven taxonomy for feed-forward 3D scene modeling that groups methods by five core challenges: feature enhancement, geometry awareness, model efficiency, augmentation strategies, and temporal-aware modeling.
-
ReplicateAnyScene: Zero-Shot Video-to-3D Composition via Textual-Visual-Spatial Alignment
ReplicateAnyScene performs fully automated zero-shot video-to-compositional-3D reconstruction by cascading alignments of generic priors from vision foundation models across textual, visual, and spatial dimensions.
-
L-PCN: A Point Cloud Accelerator Exploiting Spatial Locality through Octree-based Islandization
L-PCN exploits spatial locality in point cloud networks via octree partitioning into islands and intra-island hub scheduling, delivering 55-94% less feature fetching, 45-81% less computation, and 1.2-3.2x additional speedup on FPGA prototypes.
-
TouchAnything: Diffusion-Guided 3D Reconstruction from Sparse Robot Touches
TouchAnything reconstructs accurate 3D object geometries from only a few tactile contacts by optimizing for consistency with a pretrained visual diffusion prior.
-
Part-Level 3D Gaussian Vehicle Generation with Joint and Hinge Axis Estimation
A new framework generates part-level animatable 3D Gaussian vehicles from images by adding modules for exclusive part ownership and kinematic joint/axis prediction.
-
FusionBERT: Multi-View Image-3D Retrieval via Cross-Attention Visual Fusion and Normal-Aware 3D Encoder
FusionBERT uses cross-attention to fuse multi-view images and a normal-aware encoder for 3D models, achieving higher image-3D retrieval accuracy than prior multimodal models in both single- and multi-view settings.
-
SAM 3D: 3Dfy Anything in Images
SAM 3D reconstructs 3D objects from single images with geometry, texture, and pose, using human-model annotated data at scale and synthetic-to-real training, and wins human preference comparisons at a 5:1 ratio.
-
ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis
ViewCrafter tames video diffusion models with point-based 3D guidance and iterative trajectory planning to produce high-fidelity novel views from single or sparse images.
-
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
DROID is a new 76k-trajectory in-the-wild robot manipulation dataset spanning 564 scenes and 84 tasks that improves policy performance and generalization when used for training.
-
EvObj: Learning Evolving Object-centric Representations for 3D Instance Segmentation without Scene Supervision
EvObj learns evolving object-centric representations for unsupervised 3D instance segmentation by dynamically refining object candidates and completing partial geometries to bridge the synthetic-to-real domain gap, outperforming baselines on real and synthetic datasets.
-
Syn4D: A Multiview Synthetic 4D Dataset
Syn4D is a new multiview synthetic 4D dataset supplying dense ground-truth annotations for dynamic scene reconstruction, tracking, and human pose estimation.
-
Channel-Level Relation to Attentive Aggregation with Neighborhood-Homogeneity Constraint for Point Cloud Analysis
PointCRA reduces information loss in deep point cloud networks by treating temporal trend variation as an extra evaluation dimension alongside spatial and channel attention, guided by a neighborhood homogeneity constraint.
-
From Visual Synthesis to Interactive Worlds: Toward Production-Ready 3D Asset Generation
The paper surveys 3D asset generation methods and organizes them around the full production pipeline to assess which outputs meet engine-level requirements for interactive applications.
-
AmaraSpatial-10K: A Spatially and Semantically Aligned 3D Dataset for Spatial Computing and Embodied AI
AmaraSpatial-10K supplies 10K deployment-ready 3D assets with metric scaling and metadata, delivering 3.4x higher CLIP Recall@5 than Objaverse and 99.1% physics stability in Habitat-Sim.
-
Unposed-to-3D: Learning Simulation-Ready Vehicles from Real-World Images
Unposed-to-3D learns simulation-ready 3D vehicle models from unposed real images by predicting camera parameters for photometric self-supervision, then adding scale prediction and harmonization.
-
Neural Distribution Prior for LiDAR Out-of-Distribution Detection
NDP models prediction distributions and uses Perlin-noise OOD synthesis to reach 61.31% point-level AP on the STU LiDAR benchmark, more than 10x the prior best.
-
RETO: A Rotary-Enhanced Transformer Operator for High-Fidelity Prediction of Automotive Aerodynamics
RETO achieves relative L2 errors of 0.063 on ShapeNet and 0.089/0.097 on DrivAerML surface pressure/velocity, outperforming Transolver and other baselines.
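The RETO entry quotes relative L2 errors (0.063, 0.089, 0.097). The metric itself is simple and worth pinning down; a common definition is sketched below, though normalization details (per-sample vs. global, which norm) vary between papers, and the arrays here are illustrative.

```python
# Minimal sketch of the relative L2 error metric:
# ||pred - true||_2 / ||true||_2 over flattened field values.
import numpy as np

def relative_l2(pred, true):
    pred, true = np.asarray(pred, float), np.asarray(true, float)
    return np.linalg.norm(pred - true) / np.linalg.norm(true)

true = np.array([3.0, 4.0])
pred = true * 1.1  # uniform 10% overshoot
print(relative_l2(pred, true))  # 0.1 relative error
```

Because the error is normalized by the target's magnitude, scores are comparable across fields with very different physical scales (e.g., pressure vs. velocity).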
-
Reinforcing 3D Understanding in Point-VLMs via Geometric Reward Credit Assignment
Geometric Reward Credit Assignment disentangles rewards to geometric tokens and adds reprojection consistency to boost 3D keypoint accuracy from 0.64 to 0.93 and bounding box IoU to 0.686 on a ShapeNetCore benchmark while preserving 2D performance.
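The entry above reports bounding-box IoU as an evaluation metric. For axis-aligned 3D boxes the computation reduces to a per-axis overlap product; a minimal sketch follows, with made-up example boxes (the paper's boxes and benchmark setup are not reproduced here).

```python
# Minimal sketch of axis-aligned 3D bounding-box IoU; each box is given
# as a (min_corner, max_corner) pair of 3-vectors.
import numpy as np

def box_iou_3d(a_min, a_max, b_min, b_max):
    # Per-axis overlap lengths, clipped at zero for disjoint boxes.
    inter_dims = np.minimum(a_max, b_max) - np.maximum(a_min, b_min)
    inter = np.prod(np.clip(inter_dims, 0.0, None))
    vol_a = np.prod(a_max - a_min)
    vol_b = np.prod(b_max - b_min)
    union = vol_a + vol_b - inter
    return inter / union if union > 0 else 0.0

a_min, a_max = np.zeros(3), np.ones(3)
b_min, b_max = np.array([0.5, 0.0, 0.0]), np.array([1.5, 1.0, 1.0])
print(box_iou_3d(a_min, a_max, b_min, b_max))  # intersection 0.5, union 1.5 -> IoU 1/3
```

Oriented (rotated) boxes need a polytope intersection instead of the per-axis product, which is why axis-aligned IoU is the common simplification.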
-
Attention Is not Everything: Efficient Alternatives for Vision
A survey that taxonomizes non-Transformer vision models and evaluates their practical trade-offs across efficiency, scalability, and robustness.
-
3D Generation for Embodied AI and Robotic Simulation: A Survey
The paper surveys 3D generation techniques for embodied AI and robotics, categorizing them into data generation, simulation environments, and sim-to-real bridging while identifying bottlenecks in physical validity and transfer.