hub

Cat3d: Create any- thing in 3d with multi-view diffusion models

URLhttps://arxiv · 2024 · arXiv 2405.10314

13 Pith papers cite this work. Polarity classification is still indexing.

13 Pith papers citing it

read on arXiv browse 13 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

representative citing papers

CRePE: Curved Ray Expectation Positional Encoding for Unified-Camera-Controlled Video Generation

cs.CV · 2026-05-13 · unverdicted · novelty 7.0

CRePE supplies depth-aware positional distributions along curved rays for stable unified-camera control in frozen video DiT models.

GSCompleter: A Distillation-Free Plugin for Metric-Aware 3D Gaussian Splatting Completion in Seconds

cs.CV · 2026-04-22 · unverdicted · novelty 7.0

GSCompleter completes sparse 3D Gaussian Splatting scenes via a distillation-free generate-then-register pipeline using Stereo-Anchor lifting and Ray-Constrained Registration, delivering SOTA results on three benchmarks.

UniGeo: Unifying Geometric Guidance for Camera-Controllable Image Editing via Video Models

cs.CV · 2026-04-19 · unverdicted · novelty 7.0 · 2 refs

UniGeo unifies geometric guidance across three levels in video models to reduce geometric drift and improve consistency in camera-controllable image editing.

Geometrically Consistent Multi-View Scene Generation from Freehand Sketches

cs.CV · 2026-04-15 · unverdicted · novelty 7.0

A framework generates consistent multi-view scenes from one freehand sketch via a ~9k-sample dataset, Parallel Camera-Aware Attention Adapters, and Sparse Correspondence Supervision Loss, outperforming baselines in realism and consistency.

Towards Realistic and Consistent Orbital Video Generation via 3D Foundation Priors

cs.CV · 2026-04-14 · unverdicted · novelty 7.0

A video generation approach conditions a base model with multi-scale 3D latent features and a cross-attention adapter to produce geometrically realistic and consistent orbital videos from one image.

Any 3D Scene is Worth 1K Tokens: 3D-Grounded Representation for Scene Generation at Scale

cs.CV · 2026-04-13 · unverdicted · novelty 7.0

A 3D-grounded autoencoder and diffusion transformer allow direct generation of 3D scenes in an implicit latent space using a fixed 1K-token representation for arbitrary views and resolutions.

Novel View Synthesis as Video Completion

cs.CV · 2026-04-09 · unverdicted · novelty 7.0

Video diffusion models can be adapted into permutation-invariant generators for sparse novel view synthesis by treating the problem as video completion and removing temporal order cues.

FurnSet: Exploiting Repeats for 3D Scene Reconstruction

cs.CV · 2026-04-22 · unverdicted · novelty 6.0

FurnSet improves single-view 3D scene reconstruction by using per-object CLS tokens and set-aware self-attention to group and jointly reconstruct repeated object instances, with added scene-object conditioning and layout optimization.

Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective

cs.CV · 2026-04-15 · unverdicted · novelty 6.0

The paper proposes a problem-driven taxonomy for feed-forward 3D scene modeling that groups methods by five core challenges: feature enhancement, geometry awareness, model efficiency, augmentation strategies, and temporal-aware modeling.

NavCrafter: Exploring 3D Scenes from a Single Image

cs.CV · 2026-04-03 · unverdicted · novelty 6.0

NavCrafter generates controllable novel-view videos from one image via video diffusion, geometry-aware expansion, and enhanced 3D Gaussian Splatting to achieve state-of-the-art synthesis under large viewpoint changes.

ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis

cs.CV · 2024-09-03 · unverdicted · novelty 6.0

ViewCrafter tames video diffusion models with point-based 3D guidance and iterative trajectory planning to produce high-fidelity novel views from single or sparse images.

GeoRect4D: Geometry-Compatible Generative Rectification for Dynamic Sparse-View 3D Reconstruction

cs.CV · 2026-04-22 · unverdicted · novelty 5.0

GeoRect4D couples 3D Gaussian splatting with a single-step diffusion rectifier via degradation-aware feedback and progressive optimization to improve fidelity and consistency in sparse-view dynamic 3D reconstruction.

Asset Harvester: Extracting 3D Assets from Autonomous Driving Logs for Simulation

cs.CV · 2026-04-20 · unverdicted · novelty 5.0

Asset Harvester converts sparse in-the-wild object observations from AV driving logs into complete simulation-ready 3D assets via data curation, geometry-aware preprocessing, and a SparseViewDiT model that couples sparse-view multiview generation with 3D Gaussian lifting.

citing papers explorer

Showing 13 of 13 citing papers.

CRePE: Curved Ray Expectation Positional Encoding for Unified-Camera-Controlled Video Generation cs.CV · 2026-05-13 · unverdicted · none · ref 20
CRePE supplies depth-aware positional distributions along curved rays for stable unified-camera control in frozen video DiT models.
GSCompleter: A Distillation-Free Plugin for Metric-Aware 3D Gaussian Splatting Completion in Seconds cs.CV · 2026-04-22 · unverdicted · none · ref 23
GSCompleter completes sparse 3D Gaussian Splatting scenes via a distillation-free generate-then-register pipeline using Stereo-Anchor lifting and Ray-Constrained Registration, delivering SOTA results on three benchmarks.
UniGeo: Unifying Geometric Guidance for Camera-Controllable Image Editing via Video Models cs.CV · 2026-04-19 · unverdicted · none · ref 26 · 2 links
UniGeo unifies geometric guidance across three levels in video models to reduce geometric drift and improve consistency in camera-controllable image editing.
Geometrically Consistent Multi-View Scene Generation from Freehand Sketches cs.CV · 2026-04-15 · unverdicted · none · ref 12
A framework generates consistent multi-view scenes from one freehand sketch via a ~9k-sample dataset, Parallel Camera-Aware Attention Adapters, and Sparse Correspondence Supervision Loss, outperforming baselines in realism and consistency.
Towards Realistic and Consistent Orbital Video Generation via 3D Foundation Priors cs.CV · 2026-04-14 · unverdicted · none · ref 10
A video generation approach conditions a base model with multi-scale 3D latent features and a cross-attention adapter to produce geometrically realistic and consistent orbital videos from one image.
Any 3D Scene is Worth 1K Tokens: 3D-Grounded Representation for Scene Generation at Scale cs.CV · 2026-04-13 · unverdicted · none · ref 20
A 3D-grounded autoencoder and diffusion transformer allow direct generation of 3D scenes in an implicit latent space using a fixed 1K-token representation for arbitrary views and resolutions.
Novel View Synthesis as Video Completion cs.CV · 2026-04-09 · unverdicted · none · ref 10
Video diffusion models can be adapted into permutation-invariant generators for sparse novel view synthesis by treating the problem as video completion and removing temporal order cues.
FurnSet: Exploiting Repeats for 3D Scene Reconstruction cs.CV · 2026-04-22 · unverdicted · none · ref 14
FurnSet improves single-view 3D scene reconstruction by using per-object CLS tokens and set-aware self-attention to group and jointly reconstruct repeated object instances, with added scene-object conditioning and layout optimization.
Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective cs.CV · 2026-04-15 · unverdicted · none · ref 253
The paper proposes a problem-driven taxonomy for feed-forward 3D scene modeling that groups methods by five core challenges: feature enhancement, geometry awareness, model efficiency, augmentation strategies, and temporal-aware modeling.
NavCrafter: Exploring 3D Scenes from a Single Image cs.CV · 2026-04-03 · unverdicted · none · ref 16
NavCrafter generates controllable novel-view videos from one image via video diffusion, geometry-aware expansion, and enhanced 3D Gaussian Splatting to achieve state-of-the-art synthesis under large viewpoint changes.
ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis cs.CV · 2024-09-03 · unverdicted · none · ref 63
ViewCrafter tames video diffusion models with point-based 3D guidance and iterative trajectory planning to produce high-fidelity novel views from single or sparse images.
GeoRect4D: Geometry-Compatible Generative Rectification for Dynamic Sparse-View 3D Reconstruction cs.CV · 2026-04-22 · unverdicted · none · ref 16
GeoRect4D couples 3D Gaussian splatting with a single-step diffusion rectifier via degradation-aware feedback and progressive optimization to improve fidelity and consistency in sparse-view dynamic 3D reconstruction.
Asset Harvester: Extracting 3D Assets from Autonomous Driving Logs for Simulation cs.CV · 2026-04-20 · unverdicted · none · ref 43
Asset Harvester converts sparse in-the-wild object observations from AV driving logs into complete simulation-ready 3D assets via data curation, geometry-aware preprocessing, and a SparseViewDiT model that couples sparse-view multiview generation with 3D Gaussian lifting.

Cat3d: Create any- thing in 3d with multi-view diffusion models

hub tools

fields

years

verdicts

representative citing papers

citing papers explorer