Native and Compact Structured Latents for 3D Generation. arXiv preprint arXiv:2512.14692, 2025a.
14 Pith papers cite this work. Polarity classification is still indexing.
Citing papers
-
Rigel3D: Rig-aware Latents for Animation-Ready 3D Asset Generation
Rigel3D jointly generates rigged 3D meshes with geometry, skeleton topology, joint positions, and skinning weights using coupled surface and skeleton latent representations for image-conditioned animation-ready asset synthesis.
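The skinning weights Rigel3D predicts are the inputs to standard linear blend skinning, which deforms a mesh by blending per-bone rigid transforms. A minimal NumPy sketch of that downstream step (illustrative only, not the paper's implementation; the function and toy rig below are hypothetical):

```python
import numpy as np

def linear_blend_skinning(vertices, weights, bone_transforms):
    """Deform rest-pose vertices by blending per-bone rigid transforms.

    vertices:        (V, 3) rest-pose positions
    weights:         (V, B) skinning weights, each row summing to 1
    bone_transforms: (B, 4, 4) homogeneous bone matrices
    """
    V = vertices.shape[0]
    vh = np.hstack([vertices, np.ones((V, 1))])               # (V, 4) homogeneous
    per_bone = np.einsum('bij,vj->bvi', bone_transforms, vh)  # (B, V, 4)
    blended = np.einsum('vb,bvi->vi', weights, per_bone)      # (V, 4)
    return blended[:, :3]

# Toy rig: bone 0 is the identity, bone 1 translates +1 along x.
verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
w = np.array([[1.0, 0.0], [0.5, 0.5]])   # vertex 1 is split between bones
T = np.stack([np.eye(4), np.eye(4)])
T[1, 0, 3] = 1.0
posed = linear_blend_skinning(verts, w, T)
# vertex 0 stays put; vertex 1 moves half the translation: [1.5, 0, 0]
```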
-
On the Generation and Mitigation of Harmful Geometry in Image-to-3D Models
Image-to-3D models successfully generate harmful geometries in most cases, with under 0.3% caught by commercial filters; existing safeguards are weak, but a stacked defense cuts harmful outputs to under 1% at an 11% false-positive cost.
-
Count Anything at Any Granularity
Multi-grained counting is introduced with five granularity levels, supported by the new KubriCount dataset, generated via 3D synthesis and editing, and a HieraCount model that combines text and visual exemplars for improved accuracy.
-
Velocity-Space 3D Asset Editing
VS3D performs local 3D asset editing by injecting reconstruction-anchored source signals, partial-mean guidance, and twin-agreement residuals into the velocity sampler to control edit strength and preserve identity.
-
Geometrically Consistent Multi-View Scene Generation from Freehand Sketches
A framework generates consistent multi-view scenes from one freehand sketch via a ~9k-sample dataset, Parallel Camera-Aware Attention Adapters, and Sparse Correspondence Supervision Loss, outperforming baselines in realism and consistency.
-
Any 3D Scene is Worth 1K Tokens: 3D-Grounded Representation for Scene Generation at Scale
A 3D-grounded autoencoder and diffusion transformer allow direct generation of 3D scenes in an implicit latent space using a fixed 1K-token representation for arbitrary views and resolutions.
-
Pixal3D: Pixel-Aligned 3D Generation from Images
Pixal3D performs pixel-aligned 3D generation from images via back-projected multi-scale feature volumes, achieving fidelity close to reconstruction while supporting multi-view and scene synthesis.
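The core of back-projecting image features into a volume, as in Pixal3D's feature volumes, is projecting 3D sample points through the camera and gathering the features they land on. A hedged nearest-pixel sketch (the paper's multi-scale version presumably uses interpolated sampling; all names here are hypothetical):

```python
import numpy as np

def backproject_features(feat, K, points):
    """Gather 2D features at the projections of 3D points (nearest pixel).

    feat:   (H, W, C) image feature map
    K:      (3, 3) pinhole intrinsics; points given in the camera frame, z > 0
    points: (N, 3) 3D sample locations (e.g. voxel centers)
    """
    H, W, _ = feat.shape
    proj = points @ K.T                     # (N, 3)
    uv = proj[:, :2] / proj[:, 2:3]         # perspective divide
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, W - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, H - 1)
    return feat[v, u]                       # (N, C)

# A point on the optical axis at depth 1 lands at the principal point.
feat = np.arange(64 * 64 * 2, dtype=float).reshape(64, 64, 2)
K = np.array([[100.0,   0.0, 32.0],
              [  0.0, 100.0, 32.0],
              [  0.0,   0.0,  1.0]])
pts = np.array([[0.0, 0.0, 1.0], [0.1, 0.0, 1.0]])
vol_feat = backproject_features(feat, K, pts)   # (2, 2)
```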
-
DVD: Discrete Voxel Diffusion for 3D Generation and Editing
DVD treats voxel occupancy as a discrete variable in a diffusion framework to generate, assess, and edit sparse 3D voxels without continuous thresholding.
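Treating occupancy as a discrete variable means the forward noising process acts on 0/1 states directly, rather than on continuous values that would later need thresholding. A toy forward-corruption step under that assumption (a sketch, not DVD's actual transition kernel):

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_corrupt(occ, resample_prob):
    """One forward noising step on a binary occupancy grid: with
    probability resample_prob a voxel's 0/1 state is redrawn uniformly,
    so as resample_prob -> 1 the grid approaches the uniform prior.
    States stay in {0, 1} throughout, so no thresholding is needed.
    """
    resample = rng.random(occ.shape) < resample_prob
    noise = rng.integers(0, 2, size=occ.shape)
    return np.where(resample, noise, occ)

grid = np.zeros((8, 8, 8), dtype=int)
grid[2:6, 2:6, 2:6] = 1                # a solid cube as the clean sample
noisy = forward_corrupt(grid, 0.5)     # still strictly {0, 1}-valued
```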
-
LSRM: High-Fidelity Object-Centric Reconstruction via Scaled Context Windows
LSRM scales transformer context windows with native sparse attention and geometric routing to deliver high-fidelity feed-forward 3D reconstruction and inverse rendering that approach dense optimization quality.
-
Pose Tracking with a Foundation Pose Model and an Ensemble Directional Kalman Filter
EnDKF combines ensemble Kalman filtering with directional statistics and unit quaternions to achieve lower pose tracking error than raw measurements in synthetic constant-velocity tests and FoundationPose-based head tracking.
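Directional statistics on unit quaternions, as EnDKF uses, require sign-invariant averaging, since q and -q encode the same rotation. One standard recipe is the eigenvector method (a sketch of the general technique, not necessarily the paper's exact estimator):

```python
import numpy as np

def quaternion_mean(quats, weights=None):
    """Weighted mean of unit quaternions via the eigenvector method.

    quats: (N, 4) unit quaternions; q and -q are treated as identical.
    Returns the unit quaternion maximizing sum_i w_i * (q_i . q)^2.
    """
    q = np.asarray(quats, dtype=float)
    w = np.ones(len(q)) if weights is None else np.asarray(weights, float)
    # The outer-product accumulator is invariant to each sample's sign.
    M = np.einsum('n,ni,nj->ij', w / w.sum(), q, q)
    _, eigvecs = np.linalg.eigh(M)
    mean = eigvecs[:, -1]                 # eigenvector of the largest eigenvalue
    return mean / np.linalg.norm(mean)

# Samples near the identity rotation, one with its sign flipped; a naive
# component-wise average would be pulled toward zero by the flipped sample.
qs = np.array([
    [ 0.998,  0.05, 0.0, 0.0],
    [ 0.998, -0.05, 0.0, 0.0],
    [-1.0,    0.0,  0.0, 0.0],
])
qs /= np.linalg.norm(qs, axis=1, keepdims=True)
m = quaternion_mean(qs)   # close to [1, 0, 0, 0] up to sign
```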
-
From Visual Synthesis to Interactive Worlds: Toward Production-Ready 3D Asset Generation
The paper surveys 3D asset generation methods and organizes them around the full production pipeline to assess which outputs meet engine-level requirements for interactive applications.
-
Asset Harvester: Extracting 3D Assets from Autonomous Driving Logs for Simulation
Asset Harvester converts sparse in-the-wild object observations from AV driving logs into complete simulation-ready 3D assets via data curation, geometry-aware preprocessing, and a SparseViewDiT model that couples sparse-view multiview generation with 3D Gaussian lifting.
-
Hitem3D 2.0: Multi-View Guided Native 3D Texture Generation
Hitem3D 2.0 combines multi-view image synthesis with native 3D texture projection to improve completeness, cross-view consistency, and geometry alignment over prior methods.
-
3D Generation for Embodied AI and Robotic Simulation: A Survey
The paper surveys 3D generation techniques for embodied AI and robotics, categorizing them into data generation, simulation environments, and sim-to-real bridging while identifying bottlenecks in physical validity and transfer.