LRM: Large Reconstruction Model for Single Image to 3D

LRM: Large Reconstruction Model for Single Image to 3D · 2023 · arXiv 2311.04400

19 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

On the Generation and Mitigation of Harmful Geometry in Image-to-3D Models

cs.CR · 2026-05-10 · conditional · novelty 8.0

Image-to-3D models successfully generate harmful geometries in most cases, with under 0.3% caught by commercial filters; existing safeguards are weak, but a stacked defense cuts harmful outputs to under 1% at an 11% false-positive cost.

MiXR: Harvesting and Recomposing Geometry from Real-World Objects for In-Situ 3D Design

cs.HC · 2026-05-10 · unverdicted · novelty 7.0

MiXR enables in-situ 3D design by harvesting real-world geometry for user-defined compositions that generative AI then refines, outperforming text-only generative methods in control and fidelity per a 12-person study.

MeshFIM: Local Low-Poly Mesh Editing via Fill-in-the-Middle Autoregressive Generation

cs.GR · 2026-05-09 · unverdicted · novelty 7.0

MeshFIM enables local low-poly mesh editing by autoregressively filling target regions conditioned on context, using boundary markers, positional embeddings, and a gated geometry encoder to enforce attachment, topology, and region limits.

Large-Scale High-Quality 3D Gaussian Head Reconstruction from Multi-View Captures

cs.CV · 2026-05-05 · unverdicted · novelty 7.0

HeadsUp maps multi-view captures to UV-parameterized 3D Gaussians on a template via an encoder-decoder, achieving state-of-the-art quality and generalization after training on more than 10,000 subjects.

URoPE: Universal Relative Position Embedding across Geometric Spaces

cs.CV · 2026-04-20 · unverdicted · novelty 7.0

URoPE is a parameter-free relative position embedding for transformers that works across arbitrary geometric spaces by ray sampling and projection, yielding consistent gains on novel view synthesis, 3D detection, tracking, and depth estimation.

TokenGS: Decoupling 3D Gaussian Prediction from Pixels with Learnable Tokens

cs.CV · 2026-04-16 · unverdicted · novelty 7.0

TokenGS uses learnable Gaussian tokens in an encoder-decoder architecture to regress 3D means directly, achieving SOTA feed-forward reconstruction on static and dynamic scenes with better robustness.

Towards Realistic and Consistent Orbital Video Generation via 3D Foundation Priors

cs.CV · 2026-04-14 · unverdicted · novelty 7.0

A video generation approach conditions a base model with multi-scale 3D latent features and a cross-attention adapter to produce geometrically realistic and consistent orbital videos from one image.

AnchorSplat: Feed-Forward 3D Gaussian Splatting with 3D Geometric Priors

cs.CV · 2026-04-08 · unverdicted · novelty 7.0

AnchorSplat uses anchor-aligned 3D Gaussians guided by geometric priors for feed-forward scene reconstruction, achieving SOTA novel view synthesis on ScanNet++ with fewer primitives and better view consistency.

High-Fidelity Single-Image Head Modeling with Industry-Grade Topology

cs.CV · 2026-05-06 · unverdicted · novelty 6.0

A single-image head reconstruction method uses coarse-to-fine optimization with normal consistency, landmarks, and geometry-aware constraints on curvature and conformality to produce meshes with industry-grade topology and preserved facial identity.

Structured 3D Latents Are Surprisingly Powerful: Unleashing Generalizable Style with 2D Diffusion

cs.CV · 2026-05-06 · unverdicted · novelty 6.0

DiLAST optimizes 3D latents via guidance from a 2D diffusion model to enable generalizable style transfer for OOD styles in 3D asset generation.

Repurposing 3D Generative Model for Autoregressive Layout Generation

cs.CV · 2026-04-17 · unverdicted · novelty 6.0

LaviGen turns 3D generative models into an autoregressive layout generator that models geometric and physical constraints, delivering 19% higher physical plausibility and 65% faster inference on the LayoutVLM benchmark.

Real-Time Human Reconstruction and Animation using Feed-Forward Gaussian Splatting

cs.CV · 2026-04-11 · unverdicted · novelty 6.0

A feed-forward network predicts per-SMPL-X-vertex 3D Gaussians in canonical space from multi-view RGB images, enabling single-pass reconstruction and real-time animation via linear blend skinning.

MemoryDiorama: Generating Dynamic 3D Diorama from Everyday Photos for Memory Recall

cs.HC · 2026-04-08 · unverdicted · novelty 6.0

MemoryDiorama generates animated 3D dioramas from photos via LLM scene analysis and generative components, yielding richer autobiographical recall than photo-only or static diorama baselines.

LSRM: High-Fidelity Object-Centric Reconstruction via Scaled Context Windows

cs.CV · 2026-04-06 · conditional · novelty 6.0

LSRM scales transformer context windows with native sparse attention and geometric routing to deliver high-fidelity feed-forward 3D reconstruction and inverse rendering that approaches dense optimization quality.

R-DMesh: Video-Guided 3D Animation via Rectified Dynamic Mesh Flow

cs.CV · 2026-05-13 · unverdicted · novelty 5.0

R-DMesh uses a VAE with a learned rectification jump offset and Triflow Attention inside a rectified-flow diffusion transformer to produce video-aligned 4D meshes despite initial pose misalignment.

Pose-Aware Diffusion for 3D Generation

cs.CV · 2026-05-01 · unverdicted · novelty 5.0

PAD synthesizes 3D geometry in observation space, using depth unprojection as an anchor to eliminate pose ambiguity in image-to-3D generation.

Unposed-to-3D: Learning Simulation-Ready Vehicles from Real-World Images

cs.CV · 2026-04-21 · unverdicted · novelty 5.0

Unposed-to-3D learns simulation-ready 3D vehicle models from unposed real images by predicting camera parameters for photometric self-supervision, then adding scale prediction and harmonization.

UniMesh: Unifying 3D Mesh Understanding and Generation

cs.CV · 2026-04-19 · unverdicted · novelty 5.0

UniMesh unifies 3D mesh generation and understanding in one model via a Mesh Head interface, Chain of Mesh iterative editing, and an Actor-Evaluator self-reflection loop.

AnimateAnyMesh++: A Flexible 4D Foundation Model for High-Fidelity Text-Driven Mesh Animation

cs.CV · 2026-04-29 · unverdicted · novelty 4.0

AnimateAnyMesh++ animates arbitrary 3D meshes from text using an expanded 300K-identity DyMesh-XL dataset, a power-law topology-aware DyMeshVAE-Flex, and a variable-length rectified-flow generator to produce semantically accurate, temporally coherent animations in seconds.
