GenSim2: Scaling robot data generation with multi-modal and reasoning LLMs. arXiv preprint arXiv:2410.03645

Pu Hua, Minghuan Liu, Annabella Macaluso, Yunfeng Lin, Weinan Zhang, Huazhe Xu · 2024 · arXiv 2410.03645

7 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

AffordSim: A Scalable Data Generator and Benchmark for Affordance-Aware Robotic Manipulation

cs.RO · 2026-04-13 · unverdicted · novelty 6.0 · 2 refs

AffordSim integrates open-vocabulary 3D affordance prediction into simulation trajectory generation to create a 50-task benchmark that reaches 93% of manual annotation success rates and enables 24% average zero-shot success on a real Franka FR3.

R2RGen: Real-to-Real 3D Data Generation for Spatially Generalized Manipulation

cs.RO · 2025-10-09 · unverdicted · novelty 6.0

R2RGen introduces a simulator-free three-stage pipeline that parses, augments, and post-processes real point-cloud observation-action pairs to improve spatial generalization in robotic manipulation policies.

EmbodiedClaw: Conversational Workflow Execution for Embodied AI Development

cs.RO · 2026-04-15 · unverdicted · novelty 5.0

EmbodiedClaw automates embodied AI development workflows through conversation, reducing manual effort and improving consistency and reproducibility.

Intelligent Automation for Embodied Benchmark Construction: Pipelines, Embodiments, Simulators, and Trends

cs.RO · 2026-06-10 · unverdicted · novelty 3.0

Automation in embodied benchmark construction shifts costs from acquisition toward validation, auditability, version control, and long-term governance instead of simply lowering total cost.

World Models: A Comprehensive Survey of Architectures, Methodologies, Reasoning Paradigms, and Applications

cs.LG · 2026-05-28 · unverdicted · novelty 3.0

The paper delivers a multi-axis taxonomy for world models that maps architectures, training families, reasoning strategies, and domains from early cognitive foundations through systems such as Dreamer, MuZero, and Sora while noting evaluation gaps.

From Human Videos to Robot Manipulation: A Survey on Scalable Vision-Language-Action Learning with Human-Centric Data

cs.RO · 2026-05-18 · unverdicted · novelty 3.0

The paper surveys four classes of techniques that derive action-related supervision from human videos for VLA robot models and identifies three open challenges in episode structuring, embodiment grounding, and evaluation.

Cosmos World Foundation Model Platform for Physical AI

cs.CV · 2025-01-07 · unverdicted · novelty 3.0

The Cosmos platform supplies open-source pre-trained world models and supporting tools for building fine-tunable digital world simulations to train Physical AI.

citing papers explorer

World Models: A Comprehensive Survey of Architectures, Methodologies, Reasoning Paradigms, and Applications

cs.LG · 2026-05-28 · unverdicted · none · ref 276

The paper delivers a multi-axis taxonomy for world models that maps architectures, training families, reasoning strategies, and domains from early cognitive foundations through systems such as Dreamer, MuZero, and Sora while noting evaluation gaps.
