super hub · Mixed citations

ShapeNet: An Information-Rich 3D Model Repository

Angel X. Chang, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Thomas Funkhouser, Zimo Li · 2015 · cs.GR · arXiv 1512.03012

Mixed citation behavior. Most common role is background (57%).

145 Pith papers citing it

Background 57% of classified citations

abstract

We present ShapeNet: a richly-annotated, large-scale repository of shapes represented by 3D CAD models of objects. ShapeNet contains 3D models from a multitude of semantic categories and organizes them under the WordNet taxonomy. It is a collection of datasets providing many semantic annotations for each 3D model such as consistent rigid alignments, parts and bilateral symmetry planes, physical sizes, keywords, as well as other planned annotations. Annotations are made available through a public web-based interface to enable data visualization of object attributes, promote data-driven geometric analysis, and provide a large-scale quantitative benchmark for research in computer graphics and vision. At the time of this technical report, ShapeNet has indexed more than 3,000,000 models, 220,000 models out of which are classified into 3,135 categories (WordNet synsets). In this report we describe the ShapeNet effort as a whole, provide details for all currently available datasets, and summarize future plans.

citation-role summary

dataset 12 · background 8 · method 1

citation-polarity summary

background 12 · use dataset 7 · unclear 1 · use method 1

representative citing papers

Towards Realistic 3D Emission Materials: Dataset, Baseline, and Evaluation for Emission Texture Generation

cs.CV · 2026-04-13 · unverdicted · novelty 8.0

The work creates the first dataset and baseline for generating emission textures on 3D objects to reproduce glowing materials from input images.

ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data

cs.CV · 2021-11-17 · accept · novelty 8.0

ARKitScenes is the largest real-world indoor RGB-D dataset captured with mobile LiDAR, including high-resolution depth maps and 3D furniture bounding box annotations for advancing object detection and depth upsampling.

WarpHammer: Densifying Scene Warps with 3D Object Priors for Extreme View Synthesis

cs.CV · 2026-06-30 · unverdicted · novelty 7.0

WarpHammer densifies scene warps with 3D object priors from generative models and fuses pose-unknown auxiliary views via multi-view geometry to enable stable extreme novel view synthesis.

BIM-Edit: Benchmarking Large Language Models for IFC-Based Building Information Modeling

cs.AI · 2026-06-18 · unverdicted · novelty 7.0

BIM-Edit benchmark finds best LLM scores only 49.5% average across geometric, semantic, and topological metrics on 324 IFC editing tasks, with no model fully solving more than 3.4%.

FllumaOne: A Code-Native Multimodal CAD Dataset with Executable Programs and Kernel-Validated Feature Histories

cs.AI · 2026-06-16 · unverdicted · novelty 7.0

FllumaOne releases 100,000 kernel-validated CAD models as executable Python programs with aligned multimodal data including feature histories and geometry exports.

3D-CoS: A New 3D Reconstruction Paradigm Based on VLM Code Synthesis

cs.CV · 2026-06-09 · unverdicted · novelty 7.0

3D-CoS represents 3D objects as Blender code generated by VLMs, with workflows for planning, RAG, and agents, showing better edit fidelity than point-cloud baselines.

Rethinking 3D Shape Generation: Diffusion over Superquadrics

cs.CV · 2026-06-08 · unverdicted · novelty 7.0

Diffusion for 3D shapes is moved from dense geometry to compact superquadric parameter sets, cutting state size to roughly 7 KB per shape and enabling faster generation plus new editing capabilities.

Why Far Looks Up: Probing Spatial Representation in Vision-Language Models

cs.CV · 2026-05-28 · conditional · novelty 7.0

VLMs exhibit consistent vertical-distance entanglement in embeddings from perspective bias in natural images, producing accuracy gaps that a new synthetic benchmark SpatialTunnel exposes as model-intrinsic.

Category-Level 3D Correspondence in Camera Space via Morphable Object Priors

cs.CV · 2026-05-27 · unverdicted · novelty 7.0

Morpheus learns morphable category-level shape priors to produce implicit 3D correspondences in camera space without explicit supervision and releases the HouseCorr3D benchmark with amodal and symmetry annotations.

Metric--Phase Fields: Decoupling Distance and Sign for Thin-Structure Reconstruction from Unoriented Point Clouds

cs.CV · 2026-05-25 · unverdicted · novelty 7.0

Metric-Phase Fields decouple unsigned metric proximity from a smooth phase field with learnable sharpness to enable faithful reconstruction of thin and open structures from point clouds.

ArtSplat: Feed-Forward Articulated 3D Gaussian Splatting from Sparse Multi-State Uncalibrated Views

cs.CV · 2026-05-23 · unverdicted · novelty 7.0

ArtSplat is the first feed-forward framework for articulated 3D Gaussian Splatting that reconstructs geometry and joints from sparse multi-state uncalibrated views in one pass.

MAPS: A Synthetic Dataset for Probing Vision Models in a Controlled 3D Scene Space

cs.CV · 2026-05-19 · unverdicted · novelty 7.0

MAPS provides 2618 validated 3D meshes and a controllable rendering pipeline to attribute vision model recognition failures to specific scene parameters, finding camera distance and elevation as the dominant failure factors across 20 tested models.

OffsetAxis: UDF Mesh Reconstruction via Offset-Volume Medial Axis Extraction

cs.GR · 2026-05-14 · unverdicted · novelty 7.0

OffsetAxis reconstructs meshes from unsigned distance fields by extracting the medial axis of the alpha-offset volume using ray casting and variational medial ball optimization.

Min Generalized Sliced Gromov Wasserstein: A Scalable Path to Gromov Wasserstein

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

min-GSGW learns coupled nonlinear slicers to produce a rigid-motion-invariant, scalable approximation to the Gromov-Wasserstein distance and its transport plans.

Img2CADSeq: Image-to-CAD Generation via Sequence-Based Diffusion

cs.CV · 2026-05-13 · unverdicted · novelty 7.0

Img2CADSeq generates standard CAD sequences from images via a multi-stage pipeline with three-level hierarchical codebook encoding, importance-guided compression, and contrastive point-cloud conditioning of a VQ-Diffusion model, outperforming prior methods on new CAD-220K and PrintCAD datasets.

Count Anything at Any Granularity

cs.CV · 2026-05-11 · unverdicted · novelty 7.0

Multi-grained counting is introduced with five granularity levels, supported by the new KubriCount dataset generated via 3D synthesis and editing, and HieraCount model that combines text and visual exemplars for improved accuracy.

The Wittgensteinian Representation Hypothesis: Is Language the Attractor of Multimodal Convergence?

cs.AI · 2026-05-10 · unverdicted · novelty 7.0

Language representations serve as the asymptotic attractor for convergence in independently trained multimodal neural networks due to feature density asymmetry.

MeshFIM: Local Low-Poly Mesh Editing via Fill-in-the-Middle Autoregressive Generation

cs.GR · 2026-05-09 · unverdicted · novelty 7.0

MeshFIM enables local low-poly mesh editing by autoregressively filling target regions conditioned on context, using boundary markers, positional embeddings, and a gated geometry encoder to enforce attachment, topology, and region limits.

Rollback-Free Stable Brick Structures Generation

cs.LG · 2026-05-07 · unverdicted · novelty 7.0

Reinforcement learning internalizes physical stability rules for brick structures, enabling the first rollback-free generation with orders-of-magnitude faster inference.

Two Steps Are All You Need: Efficient 3D Point Cloud Anomaly Detection with Consistency Models

cs.CV · 2026-05-06 · unverdicted · novelty 7.0

Consistency learning reformulates 3D point cloud anomaly detection to predict clean geometry directly in one or two steps, yielding up to 80 times faster inference while matching state-of-the-art accuracy.

ADS: Random Sampling of Occupancy Functions using Adaptive Delaunay Scaffolding

cs.GR · 2026-05-05 · unverdicted · novelty 7.0

ADS adaptively refines a Delaunay scaffold to produce unbiased random samples on occupancy function surfaces together with a connecting mesh, using far fewer evaluations than existing approaches.

Generative Modeling with Orbit-Space Particle Flow Matching

cs.GR · 2026-05-04 · unverdicted · novelty 7.0

OGPP is a particle flow-matching method using orbit-space canonicalization and geometric paths that achieves lower error and fewer steps than prior approaches on 3D benchmarks.

AirZoo: A Unified Large-Scale Dataset for Grounding Aerial Geometric 3D Vision

cs.CV · 2026-04-29 · unverdicted · novelty 7.0 · 2 refs

AirZoo is a new dataset covering 378 regions across 22 countries with pixel-level metric depth and 6-DoF poses, shown via benchmarks to improve SoTA models on aerial image retrieval, cross-view matching, and multi-view 3D reconstruction.

Topo-ADV: Generating Topology-Driven Imperceptible Adversarial Point Clouds

cs.CV · 2026-04-10 · unverdicted · novelty 7.0

Topo-ADV uses differentiable persistent homology to create topology-altering perturbations that achieve up to 100% attack success on point cloud classifiers like PointNet while remaining geometrically imperceptible.

citing papers explorer

Showing 45 of 145 citing papers.

Restore3D: Breathing Life into Broken Objects with Shape and Texture Restoration cs.CV · 2026-07-01 · unverdicted · none · ref 8 · internal anchor
Restore3D restores shape and texture of broken 3D objects via multi-view image refinement with a Mask Self-Perceiver and coarse-to-fine mesh reconstruction, outperforming baselines on synthetic and real benchmarks.
AC3S: Adaptive Conditioning for 3D-Aware Synthetic Data Generation cs.CV · 2026-06-30 · unverdicted · none · ref 6 · internal anchor
AC3S adds a self-supervised visual prompt modulator to ControlNet diffusion and a multi-agent VLM prompt composer to generate photorealistic images with accurate 2D/3D annotations while avoiding over-conditioning.
ReScene: Structured Indoor Scene Reconstruction from Multi-View Captures cs.CV · 2026-06-26 · unverdicted · none · ref 28 · internal anchor
ReScene introduces HierView for view prioritization and Relation-Aware Assembly for scene graph fusion, reporting 17% lower Chamfer Distance and 26% lower LPIPS than prior baselines on ScanNet while running faster.
3D-DLP: Self-Supervised 3D Object-Centric Scene Representation Learning cs.LG · 2026-06-17 · unverdicted · none · ref 2 · internal anchor
3D-DLP decomposes 3D scenes into controllable latent particles via self-supervised reconstruction for improved robotic tasks.
Efficient RWKV-based Representation Learning for 3D Point Clouds cs.CV · 2026-06-09 · unverdicted · none · ref 43 · internal anchor
Introduces P-RWKV block and PointER self-supervised framework to adapt RWKV for efficient 3D point cloud representation learning.
Inside the Visual Mind: Neuroscience-Motivated Concept Circuits for Interpreting and Steering Vision Transformers cs.CV · 2026-06-04 · unverdicted · none · ref 125 · internal anchor
ViSAE supplies a 64K-image probing suite with 16K concepts, top-down/bottom-up circuit algorithms, and editing methods that raise WaterBirds worst-group accuracy by 48.2% over baselines.
MeshWeaver: Sparse-Voxel-Guided Surface Weaving for Autoregressive Mesh Generation cs.CV · 2026-06-03 · unverdicted · none · ref 2 · internal anchor
MeshWeaver uses sparse-voxel guidance for autoregressive surface weaving to achieve 18% compression and generate up to 16K-face meshes with improved fidelity.
Artiverse: A Diverse and Physically Grounded Dataset for Articulated Objects cs.CV · 2026-05-23 · unverdicted · none · ref 5 · internal anchor
Artiverse is a new dataset of 5.4K human-authored articulated 3D objects with detailed annotations for parts, multi-DoF joints, interior structures, and physical attributes to enable functional modeling and physics-based interaction.
Unified 3D Scene Understanding Through Physical World Modeling cs.CV · 2026-05-23 · unverdicted · none · ref 3 · internal anchor
A probabilistic graphical model called 3WM unifies 3D vision tasks into one system that performs them zero-shot by selecting different inference pathways through multimodal scene nodes.
EVA01: Unified Native 3D Understanding and Generation via Mixture-of-Transformers cs.CV · 2026-05-16 · unverdicted · none · ref 5 · internal anchor
EVA01 introduces a Mixture-of-Transformers model that natively adds 3D mesh understanding, generation, and multi-turn editing to MLLMs by decoupling understanding and generation experts with shared global self-attention.
Parameter Efficient Multi-Class Intelligent Scheduling for Multimodal Online Distributed Industrial Anomaly Detection cs.LG · 2026-05-15 · unverdicted · none · ref 42 · internal anchor
Proposes MODIAD framework with MIS scheduling solved via SMG algorithm and REC-LoRA adaptation for efficient multimodal online distributed industrial anomaly detection, reporting superior performance on MVTec 3D-AD and Eyecandies datasets.
EvObj: Learning Evolving Object-centric Representations for 3D Instance Segmentation without Scene Supervision cs.CV · 2026-05-13 · unverdicted · none · ref 5 · internal anchor
EvObj learns evolving object-centric representations for unsupervised 3D instance segmentation by dynamically refining object candidates and completing partial geometries to bridge the synthetic-to-real domain gap, outperforming baselines on real and synthetic datasets.
Symmetry in the Wild: The Role of Equivariance in Neural Fluid Surrogates cs.LG · 2026-05-12 · unverdicted · none · ref 5 · internal anchor
Explicit E(3)-equivariance in neural CFD surrogates improves generalization on diverse-geometry hemodynamics benchmarks but degrades in-distribution performance on strongly aligned aerodynamics data, consistently beating data augmentation.
Syn4D: A Multiview Synthetic 4D Dataset cs.CV · 2026-05-06 · unverdicted · none · ref 18 · internal anchor
Syn4D is a new multiview synthetic 4D dataset supplying dense ground-truth annotations for dynamic scene reconstruction, tracking, and human pose estimation.
Channel-Level Relation to Attentive Aggregation with Neighborhood-Homogeneity Constraint for Point Cloud Analysis cs.CV · 2026-05-04 · unverdicted · none · ref 34 · 2 links · internal anchor
PointCRA reduces information loss in deep point cloud networks by treating temporal trend variation as an extra evaluation dimension alongside spatial and channel attention, guided by a neighborhood homogeneity constraint.
From Visual Synthesis to Interactive Worlds: Toward Production-Ready 3D Asset Generation cs.GR · 2026-04-26 · unverdicted · none · ref 42 · 2 links · internal anchor
The paper surveys 3D asset generation methods and organizes them around the full production pipeline to assess which outputs meet engine-level requirements for interactive applications.
AmaraSpatial-10K: A Spatially and Semantically Aligned 3D Dataset for Spatial Computing and Embodied AI cs.CV · 2026-04-24 · unverdicted · none · ref 1 · 2 links · internal anchor
AmaraSpatial-10K supplies 10K deployment-ready 3D assets with metric scaling and metadata, delivering 3.4x higher CLIP Recall@5 than Objaverse and 99.1% physics stability in Habitat-Sim.
Unposed-to-3D: Learning Simulation-Ready Vehicles from Real-World Images cs.CV · 2026-04-21 · unverdicted · none · ref 3 · internal anchor
Unposed-to-3D learns simulation-ready 3D vehicle models from unposed real images by predicting camera parameters for photometric self-supervision, then adding scale prediction and harmonization.
Neural Distribution Prior for LiDAR Out-of-Distribution Detection cs.CV · 2026-04-10 · unverdicted · none · ref 10 · internal anchor
NDP models prediction distributions and uses Perlin noise OOD synthesis to reach 61.31% point-level AP on STU LiDAR benchmark, over 10x prior best.
Hierarchical Feature Learning for Medical Point Clouds via State Space Model cs.CV · 2025-04-17 · unverdicted · none · ref 1 · internal anchor
Presents an SSM-based hierarchical feature learning method for medical point clouds that reports superior performance on classification, completion, and segmentation using a new dataset MedPointS.
Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model cs.CV · 2023-10-23 · unverdicted · none · ref 1 · internal anchor
Zero123++ produces high-quality 3D-consistent multi-view images from a single input by fine-tuning Stable Diffusion with targeted conditioning and training methods.
DetailCLIP: Injecting Image Details into CLIP's Feature Space cs.CV · 2022-08-31 · unverdicted · none · ref 4 · internal anchor
A patch-based fusion method extends CLIP to high-resolution images by retaining multi-scale details for improved class-prompted retrieval.
SynthCity: A large scale synthetic point cloud cs.CV · 2019-07-10 · unverdicted · none · ref 4 · internal anchor
SynthCity is a 367.9M point synthetic full-colour Mobile Laser Scanning point cloud with per-point labels from nine categories, generated in Blender for an urban environment.
Linkify: Learning from Interface-Augmented Assembly Graphs cs.CV · 2026-07-01 · unverdicted · none · ref 17 · internal anchor
Linkify augments assembly graphs with corrected interface point clouds and trains GATv2 for masked part prediction, outperforming non-graph baselines on Fusion 360 data.
Mitigating Positional Leakage in 3D Masked Autoencoders for Robust Representation Learning cs.CV · 2026-06-30 · unverdicted · none · ref 3 · internal anchor
MPL-MAE introduces recalibrated positional embedding and gated positional interface modules to reduce positional over-reliance in 3D masked autoencoders and improve semantic representation quality.
Domain Generalizable Adaptation of 3D Vision-Language Models via Regularized Fine-Tuning cs.CV · 2026-06-16 · unverdicted · none · ref 43 · internal anchor
ReFine3D uses selective layer tuning, multi-view consistency regularization, LLM-generated text diversity, point-rendered supervision, and confidence-weighted test-time augmentation to improve domain generalization in 3D LMMs by 1-3% on benchmarks.
Kwai Keye-VL-2.0 Technical Report cs.CV · 2026-06-09 · unverdicted · none · ref 130 · internal anchor
Kwai Keye-VL-2.0-30B-A3B is a 30B MoE model with 3B active parameters using DSA adaptation and MOPD distillation that reports SOTA results on video understanding and agent benchmarks.
Learning Representations from 3D Gaussian Splats cs.CV · 2026-05-28 · unverdicted · none · ref 4 · internal anchor
Comparative benchmark of geometric deep learning models on 3D Gaussian Splatting representations for scene classification via end-to-end training, linear probing, and clustering.
Uni-RCM: Unified Reference-guided Cross-modal Mapping for Multi-Class Anomaly Detection cs.CV · 2026-05-28 · unverdicted · none · ref 21 · internal anchor
Uni-RCM achieves state-of-the-art multi-class anomaly detection on MVTec-3D AD via a reference guide block and offline residual quantizer.
RETO: A Rotary-Enhanced Transformer Operator for High-Fidelity Prediction of Automotive Aerodynamics eess.IV · 2026-04-30 · unverdicted · none · ref 17 · internal anchor
RETO achieves relative L2 errors of 0.063 on ShapeNet and 0.089/0.097 on DrivAerML surface pressure/velocity, outperforming Transolver and other baselines.
Reinforcing 3D Understanding in Point-VLMs via Geometric Reward Credit Assignment cs.CV · 2026-04-23 · unverdicted · none · ref 5 · internal anchor
Geometric Reward Credit Assignment disentangles rewards to geometric tokens and adds reprojection consistency to boost 3D keypoint accuracy from 0.64 to 0.93 and bounding box IoU to 0.686 on a ShapeNetCore benchmark while preserving 2D performance.
Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation cs.CV · 2025-01-21 · unverdicted · none · ref 8 · internal anchor
Hunyuan3D 2.0 scales flow-based diffusion transformers and texture synthesis models to generate high-resolution textured 3D assets that outperform prior state-of-the-art in geometry, alignment, and texture quality.
AI+CAD Data Representation Architecture: From DeepCAD Solid Modeling to WHUCAD Industrial-Level Parametric Feature Modeling cs.GR · 2026-06-15 · unverdicted · none · ref 2 · internal anchor
The paper classifies AI+CAD data representations and argues WHUCAD's three-level architecture provides better foundational support for industrial parametric feature modeling than DeepCAD.
Benchmarking stereo reconstruction for 3D printable Martian terrain models cs.CV · 2026-06-09 · unverdicted · none · ref 1 · internal anchor
RAFT-Stereo outperforms SGBM on Middlebury but shows weaker edge alignment and higher reprojection error on Curiosity imagery, while geometry completion trades local accuracy for mesh connectivity in printable models.
Principles and Practice of Deep Representation Learning: or a Mathematical Theory of Memory cs.LG · 2026-06-04 · unverdicted · none · ref 13 · internal anchor
The book presents principles from optimization and information theory to explain deep network architectures and enable new interpretable models.
Attention Is not Everything: Efficient Alternatives for Vision cs.CV · 2026-04-19 · unverdicted · none · ref 34 · internal anchor
A survey that taxonomizes non-Transformer vision models and evaluates their practical trade-offs across efficiency, scalability, and robustness.
A Survey on 3D Gaussian Splatting Applications: Segmentation, Editing, and Generation cs.CV · 2025-08-13 · unverdicted · none · ref 284 · internal anchor
A survey that categorizes and summarizes methods applying 3D Gaussian Splatting to segmentation, editing, generation, and related tasks, including datasets and evaluation protocols.
Hunyuan3D 2.1: From Images to High-Fidelity 3D Assets with Production-Ready PBR Material cs.CV · 2025-06-18 · unverdicted · none · ref 28 · internal anchor
Hunyuan3D 2.1 is a two-part system with DiT for shape generation and Paint for texture synthesis that produces high-fidelity 3D assets with PBR materials.
Advances in Neural 3D Mesh Texturing: A Survey cs.CV · 2026-05-28 · unverdicted · none · ref 145 · internal anchor
A literature survey that organizes neural 3D mesh texturing methods into a taxonomy spanning early GAN-based approaches to modern diffusion pipelines, while reviewing architectures, datasets, evaluation, and open challenges.
3D Generation for Embodied AI and Robotic Simulation: A Survey cs.RO · 2026-04-29 · unverdicted · none · ref 194 · 3 links · internal anchor
The paper surveys 3D generation techniques for embodied AI and robotics, categorizing them into data generation, simulation environments, and sim-to-real bridging while identifying bottlenecks in physical validity and transfer.
NeRF: Neural Radiance Field in 3D Vision: A Comprehensive Review (Updated Post-Gaussian Splatting) cs.CV · 2022-10-01 · unverdicted · none · ref 32 · internal anchor
A literature survey of NeRF and neural field methods from 2020-2025, organized by architecture and application taxonomies with benchmarks and dataset overviews, covering both pre- and post-Gaussian Splatting periods.
A review on deep learning techniques for 3D sensed data classification cs.CV · 2019-07-09 · unverdicted · none · ref 45 · internal anchor
A survey of deep learning architectures for 3D sensed data classification covering RGB-D, multi-view, volumetric and end-to-end methods along with datasets and future directions.
L-PCN: A Point Cloud Accelerator Exploiting Spatial Locality through Octree-based Islandization cs.AR · 2026-04-12 · unreviewed · ref 4 · internal anchor
Picasso: Holistic Scene Reconstruction with Physics-Constrained Sampling cs.CV · 2026-02-08 · unreviewed · ref 8 · internal anchor
Efficient Transferable Optimal Transport via Min-Sliced Transport Plans cs.CV · 2025-11-24 · unreviewed · ref 8 · internal anchor
