DreamGaussian4D: Generative 4D Gaussian Splatting. arXiv preprint arXiv:2312.17142

27 Pith papers cite this work. Polarity classification is still indexing.

hub tools

citation-role summary

background: 3 · method: 1

citation-polarity summary

verdicts

unverdicted: 27

representative citing papers

Rigel3D: Rig-aware Latents for Animation-Ready 3D Asset Generation

cs.GR · 2026-05-13 · unverdicted · novelty 8.0

Rigel3D jointly generates rigged 3D meshes with geometry, skeleton topology, joint positions, and skinning weights using coupled surface and skeleton latent representations for image-conditioned animation-ready asset synthesis.

Functionalization via Structure Completion and Motion Rectification

cs.CV · 2026-05-18 · unverdicted · novelty 7.0

Object functionalization is cast as neural graph completion over a functional graph of parts, contacts, and motions, followed by geometry realization that also rectifies erroneous motions, demonstrated on furniture with a new paired dataset.

Alignment Is All You Need For X-to-4D Generation

cs.CV · 2026-07-02 · unverdicted · novelty 6.0

Align4D introduces object distance alignment, motion-geometry joint alignment, asynchronous optimization, and the X4D dataset to achieve state-of-the-art X-to-4D generation from multimodal inputs.

SimWorlds: A Multi-Agent System for Dynamic 3D Scene Creation

cs.AI · 2026-07-02 · unverdicted · novelty 6.0

SimWorlds presents a multi-agent system with planner-coder-reviewer workflow, layered scene protocol, and runtime inspection tools to create dynamic 4D scenes from text, plus the 4DBuildBench benchmark showing outperformance over baselines.

Feed-forward Motion In-betweening for Any 4D

cs.CV · 2026-06-20 · unverdicted · novelty 6.0

Proposes a feed-forward keyframe-conditioned in-betweening method for arbitrary 4D meshes using a topology-agnostic VAE and MMDiT-based rectified flow model.

DynaTok: Token-Based 4D Reconstruction from Partial Point Clouds

cs.CV · 2026-06-10 · unverdicted · novelty 6.0

DynaTok introduces a token-based framework for correspondence-free 4D reconstruction from partial point cloud sequences via latent encoding, transformer aggregation, residual decoupling, and flow-matching decoding.

Helix4D: Complex 4D Mesh Generation

cs.CV · 2026-05-25 · unverdicted · novelty 6.0

Helix4D generates high-quality dynamic 4D meshes from videos by extending Trellis2 with sliding-window cross-frame attention anchored on the first frame and a repurposed 4D temporal encoding.

Variance Reduction for Expectations with Diffusion Teachers

cs.LG · 2026-05-20 · unverdicted · novelty 6.0 · 2 refs

CARV amortizes upstream diffusion teacher costs over noise resamples with timestep importance sampling and stratified-inverse-CDF sampling, delivering 2-3x effective compute gains in text-to-3D experiments and order-of-magnitude variance cuts in single-step distillation.

Fast 4D Mesh Generation by Spatio-Temporal Attention Chains

cs.CV · 2026-05-19 · unverdicted · novelty 6.0

A training-free Spatio-Temporal Attention Chain framework accelerates 4D mesh generation 13x, improves quality, scales to 16x longer videos, and supports downstream tracking and camera estimation.

R-DMesh: Video-Guided 3D Animation via Rectified Dynamic Mesh Flow

cs.CV · 2026-05-13 · unverdicted · novelty 6.0 · 2 refs

R-DMesh proposes a VAE-based disentanglement of base mesh, motion trajectories, and rectification offset plus Triflow Attention and rectified-flow diffusion to produce 4D meshes aligned to video despite initial pose mismatch.

Velox: Learning Representations of 4D Geometry and Appearance

cs.CV · 2026-05-06 · unverdicted · novelty 6.0

Velox compresses dynamic point clouds into latent tokens that support geometry via 4D surface modeling and appearance via 3D Gaussians, showing strong results on video-to-4D generation, tracking, and image-to-4D cloth simulation.

CP4D: Compositional Physics-aware 4D Scene Generation

cs.CV · 2026-06-08 · unverdicted · novelty 5.0

CP4D generates physically consistent 4D scenes via compositional integration of pre-trained 3D models, hybrid simulator-diffusion motion synthesis, and automated scene composition.

SkelMo: Universal Skeletal Motion Generation for 3D Rigged Shapes

cs.CV · 2026-06-01 · unverdicted · novelty 5.0 · 2 refs

SkelMo introduces a category-agnostic diffusion framework for skeletal motion generation from 2D videos, trained on a new dataset of ~20,000 rigged 3D animations with a structural-semantic injection mechanism.

citing papers explorer

Showing 23 of 23 citing papers after filters.

  • Rigel3D: Rig-aware Latents for Animation-Ready 3D Asset Generation cs.GR · 2026-05-13 · unverdicted · none · ref 8

    Rigel3D jointly generates rigged 3D meshes with geometry, skeleton topology, joint positions, and skinning weights using coupled surface and skeleton latent representations for image-conditioned animation-ready asset synthesis.

  • Scene-Level Heterogeneous Physics Simulation with 3D Gaussian Splats cs.GR · 2026-06-19 · unverdicted · none · ref 27

    A Representation Abstraction Framework converts 3DGS, meshes, and fluids into unified particles for scene-level heterogeneous multi-solver physics simulation.

  • PhysAgent: Automating Physics-Based 4D Synthesis via Trajectory-Grounded Multi-Agent Feedback cs.RO · 2026-06-07 · unverdicted · none · ref 29

    PhysAgent is a simulator-in-the-loop multi-agent system that automates physically grounded 4D synthesis from multimodal prompts by using trajectory feedback from vision models and LLM reasoning to optimize force fields.

  • Functionalization via Structure Completion and Motion Rectification cs.CV · 2026-05-18 · unverdicted · none · ref 292

    Object functionalization is cast as neural graph completion over a functional graph of parts, contacts, and motions, followed by geometry realization that also rectifies erroneous motions, demonstrated on furniture with a new paired dataset.

  • AniGen: Unified $S^3$ Fields for Animatable 3D Asset Generation cs.GR · 2026-04-09 · unverdicted · none · ref 25

    AniGen directly generates animatable 3D assets with consistent shape, skeleton, and skinning from single images using unified $S^3$ fields and a two-stage flow-matching pipeline.

  • Action Images: End-to-End Policy Learning via Multiview Video Generation cs.CV · 2026-04-07 · unverdicted · none · ref 49

    Action Images turn robot arm motions into interpretable multiview pixel videos, letting video backbones serve as zero-shot policies for end-to-end robot learning.

  • PerpetualWonder: Long-Horizon Action-Conditioned 4D Scene Generation cs.CV · 2026-02-04 · unverdicted · none · ref 33

    PerpetualWonder introduces a closed-loop generative simulator with a unified physical-visual representation for long-horizon action-conditioned 4D scene generation from one image.

  • Alignment Is All You Need For X-to-4D Generation cs.CV · 2026-07-02 · unverdicted · none · ref 7

    Align4D introduces object distance alignment, motion-geometry joint alignment, asynchronous optimization, and the X4D dataset to achieve state-of-the-art X-to-4D generation from multimodal inputs.

  • SimWorlds: A Multi-Agent System for Dynamic 3D Scene Creation cs.AI · 2026-07-02 · unverdicted · none · ref 43

    SimWorlds presents a multi-agent system with planner-coder-reviewer workflow, layered scene protocol, and runtime inspection tools to create dynamic 4D scenes from text, plus the 4DBuildBench benchmark showing outperformance over baselines.

  • Feed-forward Motion In-betweening for Any 4D cs.CV · 2026-06-20 · unverdicted · none · ref 38

    Proposes a feed-forward keyframe-conditioned in-betweening method for arbitrary 4D meshes using a topology-agnostic VAE and MMDiT-based rectified flow model.

  • DynaTok: Token-Based 4D Reconstruction from Partial Point Clouds cs.CV · 2026-06-10 · unverdicted · none · ref 9

    DynaTok introduces a token-based framework for correspondence-free 4D reconstruction from partial point cloud sequences via latent encoding, transformer aggregation, residual decoupling, and flow-matching decoding.

  • PointAction: 3D Points as Universal Action Representations for Robot Control cs.RO · 2026-06-02 · unverdicted · none · ref 49

    PointAction uses predicted dynamic 3D pointmaps from fine-tuned video models as an embodiment-agnostic action representation to map video predictions to executable robot actions.

  • PhyGenHOI: Physically-Aware 4D Generation of Dynamic Human-Object Interactions cs.CV · 2026-05-28 · unverdicted · none · ref 4

    PhyGenHOI couples a motion diffusion model for humans with material point method simulation for objects on 3D Gaussians, using attraction loss, contact re-simulation, and masked video-SDS to produce physically consistent dynamic interactions from text.

  • Helix4D: Complex 4D Mesh Generation cs.CV · 2026-05-25 · unverdicted · none · ref 19

    Helix4D generates high-quality dynamic 4D meshes from videos by extending Trellis2 with sliding-window cross-frame attention anchored on the first frame and a repurposed 4D temporal encoding.

  • Variance Reduction for Expectations with Diffusion Teachers cs.LG · 2026-05-20 · unverdicted · none · ref 61 · 2 links

    CARV amortizes upstream diffusion teacher costs over noise resamples with timestep importance sampling and stratified-inverse-CDF sampling, delivering 2-3x effective compute gains in text-to-3D experiments and order-of-magnitude variance cuts in single-step distillation.

  • Fast 4D Mesh Generation by Spatio-Temporal Attention Chains cs.CV · 2026-05-19 · unverdicted · none · ref 56

    A training-free Spatio-Temporal Attention Chain framework accelerates 4D mesh generation 13x, improves quality, scales to 16x longer videos, and supports downstream tracking and camera estimation.

  • R-DMesh: Video-Guided 3D Animation via Rectified Dynamic Mesh Flow cs.CV · 2026-05-13 · unverdicted · none · ref 116 · 2 links

    R-DMesh proposes a VAE-based disentanglement of base mesh, motion trajectories, and rectification offset plus Triflow Attention and rectified-flow diffusion to produce 4D meshes aligned to video despite initial pose mismatch.

  • Velox: Learning Representations of 4D Geometry and Appearance cs.CV · 2026-05-06 · unverdicted · none · ref 70

    Velox compresses dynamic point clouds into latent tokens that support geometry via 4D surface modeling and appearance via 3D Gaussians, showing strong results on video-to-4D generation, tracking, and image-to-4D cloth simulation.

  • Sculpt4D: Generating 4D Shapes via Sparse-Attention Diffusion Transformers cs.CV · 2026-04-23 · unverdicted · none · ref 28

    Sculpt4D generates temporally coherent 4D shapes by integrating a block sparse attention mechanism with time-decaying mask into a pretrained 3D diffusion transformer, achieving SOTA results with 56% less computation.

  • Scene-Agnostic Object-Centric Representation Learning for 3D Gaussian Splatting cs.CV · 2026-04-10 · unverdicted · none · ref 33

    A scene-agnostic object codebook learned via unsupervised object-centric learning provides consistent identity-anchored representations for 3D Gaussians across multiple scenes.

  • CP4D: Compositional Physics-aware 4D Scene Generation cs.CV · 2026-06-08 · unverdicted · none · ref 29

    CP4D generates physically consistent 4D scenes via compositional integration of pre-trained 3D models, hybrid simulator-diffusion motion synthesis, and automated scene composition.

  • SkelMo: Universal Skeletal Motion Generation for 3D Rigged Shapes cs.CV · 2026-06-01 · unverdicted · none · ref 19 · 2 links

    SkelMo introduces a category-agnostic diffusion framework for skeletal motion generation from 2D videos, trained on a new dataset of ~20,000 rigged 3D animations with a structural-semantic injection mechanism.

  • AnimateAnyMesh++: A Flexible 4D Foundation Model for High-Fidelity Text-Driven Mesh Animation cs.CV · 2026-04-29 · unverdicted · none · ref 10

    AnimateAnyMesh++ animates arbitrary 3D meshes from text using an expanded 300K-identity DyMesh-XL dataset, a power-law topology-aware DyMeshVAE-Flex, and a variable-length rectified-flow generator to produce semantically accurate, temporally coherent animations in seconds.