{"work":{"id":"62f0fb6c-e6ae-4dc4-95a4-d9dd64b240e8","openalex_id":null,"doi":null,"arxiv_id":"2310.08864","raw_key":null,"title":"Open X-Embodiment: Robotic Learning Datasets and RT-X Models","authors":null,"authors_text":"Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta","year":2023,"venue":"cs.RO","abstract":"Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train generalist X-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective X-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms. More details can be found on the project website https://robotics-transformer-x.github.io.","external_url":"https://arxiv.org/abs/2310.08864","cited_by_count":null,"metadata_source":"pith","metadata_fetched_at":"2026-05-25T04:32:56.658380+00:00","pith_arxiv_id":"2310.08864","created_at":"2026-05-08T21:49:16.156941+00:00","updated_at":"2026-06-05T21:23:00.469572+00:00","title_quality_ok":true,"display_title":"Open X-Embodiment: Robotic Learning Datasets and RT-X Models","render_title":"Open X-Embodiment: Robotic Learning Datasets and RT-X Models"},"hub":{"state":{"work_id":"62f0fb6c-e6ae-4dc4-95a4-d9dd64b240e8","tier":"super_hub","tier_reason":"100+ Pith inbound or 10,000+ external citations","pith_inbound_count":104,"external_cited_by_count":null,"distinct_field_count":5,"first_pith_cited_at":"2023-11-02T16:34:33+00:00","last_pith_cited_at":"2026-05-22T16:59:55+00:00","author_build_status":"needed","summary_status":"needed","contexts_status":"needed","graph_status":"needed","ask_index_status":"needed","reader_status":"not_needed","recognition_status":"not_needed","updated_at":"2026-06-08T04:42:48.337486+00:00","tier_text":"super_hub"},"tier":"super_hub","role_counts":[{"context_role":"background","n":22},{"context_role":"dataset","n":18},{"context_role":"baseline","n":2},{"context_role":"method","n":1}],"polarity_counts":[{"context_polarity":"background","n":23},{"context_polarity":"use_dataset","n":16},{"context_polarity":"baseline","n":3},{"context_polarity":"use_method","n":1}],"runs":{"ask_index":{"job_type":"ask_index","status":"succeeded","result":{"title":"Open X-Embodiment: Robotic Learning Datasets and RT-X Models","claims":[{"claim_text":"Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train generalist X-robot policy that can be adapted efficiently to new robots, tasks, and enviro","claim_type":"abstract","evidence_strength":"source_metadata"},{"claim_text":"action (VLA) models [8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19], has shown promise in enabling generalizable skills. However, realizing the vision of omnipotent robot foundation models faces persistent challenges. Two key bottlenecks hinder progress: 1) Data scarcity: State-of-the-art models, like OpenVLA [20] and Octo [21], rely on massive datasets like the Open-X Embodiment dataset (4,000 hours) [22] or even larger corpora like the 10,000-hour dataset used by π0 & π0.5 [8, 13]. Collecting s","claim_type":"dataset","confidence":0.95,"evidence_strength":"citation_context"},{"claim_text":"OursOpenVLART-2-XOcto RT-1-XRT-1 WidowX Robot SIMPLER Google Robot SIMPLER (Visual Matching) Google Robot SIMPLER (Variant Aggregation) 17.5 1.1 4.2 51.3 1.2 11 30.2 42.439.3 34.3 43.7 52.454.4 46.3 61.3 74.8 4.9 12.1 71.2 8.0 61.5 5.8 6.8 61.4 Realman Robot Real-World Evaluations (a) (b) (c) Figure 1. (a) Success rate (%) comparison of our model against RT-1 [7], RT-1-X [48], RT-2-X [48], Octo [62], and OpenVLA [30] across simulated benchmarks (first thee charts) and real-world evaluations (las","claim_type":"dataset","confidence":0.95,"evidence_strength":"citation_context"},{"claim_text":"pler imitation learning paradigm in which we develop high- performing policies just by training once offline on a fixed dataset of expert task demonstrations. III. P RELIMINARIES Original OpenVLA formulation. We use OpenVLA [23] as our representative base VLA, a 7B-parameter manipulation policy created by fine-tuning the Prismatic VLM [21] on 1M episodes from the Open X-Embodiment dataset [34]. See Appendix A for architecture details. OpenVLA's original train- ing formulation uses autoregressive","claim_type":"dataset","confidence":0.95,"evidence_strength":"citation_context"},{"claim_text":"Closed-Loop Robot Control Policy User: Wipe the table. OpenVLA: [x, , Grip] = …ΔΔθΔ Multi-Robot Control & Eﬃcient Fine-Tuning Large-Scale Robot Training Data Fully Data Weights Code Open-Source Figure 1: We present OpenVLA, a 7B-parameter open-source vision-language-action model (VLA), trained on 970k robot episodes from the Open X-Embodiment dataset [ 1]. OpenVLA sets a new state of the art for generalist robot manipulation policies. It supports controlling multiple robots out of the box and ca","claim_type":"dataset","confidence":0.95,"evidence_strength":"citation_context"},{"claim_text":"based on autoregressive action token confidence, improving robustness in manipulation. To enhance generalization capability, we initialize HybridVLA with an internet-scale pretrained VLM [27], and design a step-by-step training approach [13, 10]. As shown in Figure 1, our model undergoes further pretraining on large, diverse, cross-embodiment robotic datasets, including Open X-Embodiment [28], DROID [29], and ROBOMIND [30], covering 760K trajectories and over 10K A800 GPU training hours. Subsequ","claim_type":"dataset","confidence":0.9,"evidence_strength":"citation_context"},{"claim_text":"RELATEDWORK A. Vision-Language-Action Models for Robot Planning Vision-language models have enabled natural language task specification and common-sense reasoning for robot plan- ning [2, 17, 28, 38]. The Robotics Transformer series [10, 11] demonstrated that web-scale pretraining transfers to robotic control, motivating large-scale cross-embodiment efforts [14, 45, 30]. Recent work has pushed toward more capable gen- eralist policies through flow matching [8], embodied reason- TABLE I:Adaptatio","claim_type":"background","confidence":0.9,"evidence_strength":"citation_context"}],"why_cited":"Pith tracks Open X-Embodiment: Robotic Learning Datasets and RT-X Models because it crossed a citation-hub threshold. Current citing contexts most often use it as background evidence (21 contexts).","role_counts":[{"n":21,"context_role":"background"},{"n":18,"context_role":"dataset"},{"n":2,"context_role":"baseline"},{"n":1,"context_role":"method"}]},"error":null,"updated_at":"2026-05-23T19:14:45.617008+00:00"},"author_expand":{"job_type":"author_expand","status":"succeeded","result":{"authors_linked":[{"id":"c02ddc4a-8d9b-4428-bf16-930ccb7d1495","orcid":null,"display_name":"Open X-Embodiment Collaboration"},{"id":"64527222-395f-4b4f-a5ba-d8b4f0d84a89","orcid":null,"display_name":"Abby O'Neill"},{"id":"335f6bd2-0c9d-497e-9fed-bc38209cc78d","orcid":null,"display_name":"Abdul Rehman"},{"id":"b53a2155-7888-4100-b454-717cef14e7ae","orcid":null,"display_name":"Abhinav Gupta"},{"id":"5a8dab45-493d-46dc-a088-9053823df692","orcid":null,"display_name":"Abhiram Maddukuri"},{"id":"88aa2508-3d42-40f8-95a7-ba87ca0eef09","orcid":null,"display_name":"Abhishek Gupta"}]},"error":null,"updated_at":"2026-05-23T19:14:46.245774+00:00"},"context_extract":{"job_type":"context_extract","status":"succeeded","result":{"enqueued_papers":25},"error":null,"updated_at":"2026-05-14T14:21:37.798913+00:00"},"graph_features":{"job_type":"graph_features","status":"succeeded","result":{"co_cited":[{"title":"OpenVLA: An Open-Source Vision-Language-Action Model","work_id":"3e7e65c5-5aed-4fe9-8414-2092bcb31cc7","shared_citers":27},{"title":"RT-1: Robotics Transformer for Real-World Control at Scale","work_id":"e11bda85-8531-46bc-a07f-d0ade3643ab1","shared_citers":22},{"title":"RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control","work_id":"ff438a8a-8003-4fae-9131-acd418b3597b","shared_citers":22},{"title":"$\\pi_0$: A Vision-Language-Action Flow Model for General Robot Control","work_id":"f790abdc-a796-482f-a40d-f8ee035ecfc2","shared_citers":19},{"title":"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware","work_id":"6fe159e0-fa73-481a-88d4-4719c15140be","shared_citers":18},{"title":"Octo: An Open-Source Generalist Robot Policy","work_id":"f9ca0722-8855-48c3-a27a-0eefb7e19253","shared_citers":16},{"title":"DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset","work_id":"13253de2-3d89-415c-8c2f-3adb25d4c337","shared_citers":14},{"title":"Do As I Can, Not As I Say: Grounding Language in Robotic Affordances","work_id":"037320f1-b0a9-4cbe-a639-bfb25409ce71","shared_citers":13},{"title":"$\\pi_{0.5}$: a Vision-Language-Action Model with Open-World Generalization","work_id":"d1ad7304-d09a-49bc-809e-846439f6aff9","shared_citers":12},{"title":"PaLM-E: An Embodied Multimodal Language Model","work_id":"5b99811a-1d93-47e2-9d59-f4045a0b74a2","shared_citers":11},{"title":"GR00T N1: An Open Foundation Model for Generalist Humanoid Robots","work_id":"e2db69c7-ee8a-4cb7-a761-7b8de1dfcf97","shared_citers":9},{"title":"3D-VLA: A 3D Vision-Language-Action Generative World Model","work_id":"aebf924c-e761-437e-9cee-f1ccc2e427bd","shared_citers":8},{"title":"Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets","work_id":"59e728c0-b6ca-4759-a8f4-02b981f2220f","shared_citers":8},{"title":"Flow Matching for Generative Modeling","work_id":"6edb71c4-5d64-40af-a394-9757ea051a36","shared_citers":8},{"title":"LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning","work_id":"662203ad-084f-42c4-8e60-977b3173755b","shared_citers":8},{"title":"Qwen3-VL Technical Report","work_id":"1fe243aa-e3c0-4da6-b391-4cbcfc88d5c0","shared_citers":8},{"title":"DINOv2: Learning Robust Visual Features without Supervision","work_id":"26b304e5-b54a-4f26-be7e-83299eca52e4","shared_citers":7},{"title":"RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation","work_id":"12319725-bc7d-4c32-a229-ad270a7460bc","shared_citers":7},{"title":"Tinyvla: To- wards fast, data-efficient vision-language-action models for robotic manipulation","work_id":"20b3dc32-113f-4f7b-8826-52a8d03fd6b3","shared_citers":7},{"title":"Diffusion Policy: Visuomotor Policy Learning via Action Diffusion","work_id":"2dce18e6-f07a-4f57-8a81-e71c3e6a293c","shared_citers":6},{"title":"Mimicgen: A data generation system for scalable robot learning using human demonstrations","work_id":"ec7924ec-9bf2-4ccd-9901-6697dc517f84","shared_citers":6},{"title":"Mo- bile aloha: Learning bimanual mobile manipulation with low-cost whole-body teleoperation","work_id":"5f6ff8ef-ed80-4c00-92c2-361c80bf8448","shared_citers":6},{"title":"Qt-opt: Scalable deep reinforcement learning for vision-based robotic manipulation","work_id":"9a64fd9b-aa3d-48af-ac99-b368b857a6e0","shared_citers":6},{"title":"RoboTwin 2.0: A Scalable Data Generator and Benchmark with Strong Domain Randomization for Robust Bimanual Robotic Manipulation","work_id":"9b985126-4a2f-4bdf-b014-2a7524ec634e","shared_citers":6}],"time_series":[{"n":11,"year":2024},{"n":3,"year":2025},{"n":33,"year":2026}],"dependency_candidates":[]},"error":null,"updated_at":"2026-05-14T14:31:30.742788+00:00"},"identity_refresh":{"job_type":"identity_refresh","status":"succeeded","result":{"items":[{"title":"Qwen3 Technical Report","outcome":"unchanged","work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e","resolver":"local_arxiv","confidence":0.98,"old_work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e"}],"counts":{"fixed":0,"merged":0,"unchanged":1,"quarantined":0,"needs_external_resolution":0},"errors":[],"attempted":1},"error":null,"updated_at":"2026-05-14T14:21:47.342933+00:00"},"role_polarity":{"job_type":"role_polarity","status":"succeeded","result":{"title":"Open X-Embodiment: Robotic Learning Datasets and RT-X Models","claims":[{"claim_text":"Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train generalist X-robot policy that can be adapted efficiently to new robots, tasks, and enviro","claim_type":"abstract","evidence_strength":"source_metadata"},{"claim_text":"action (VLA) models [8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19], has shown promise in enabling generalizable skills. However, realizing the vision of omnipotent robot foundation models faces persistent challenges. Two key bottlenecks hinder progress: 1) Data scarcity: State-of-the-art models, like OpenVLA [20] and Octo [21], rely on massive datasets like the Open-X Embodiment dataset (4,000 hours) [22] or even larger corpora like the 10,000-hour dataset used by π0 & π0.5 [8, 13]. Collecting s","claim_type":"dataset","confidence":0.95,"evidence_strength":"citation_context"},{"claim_text":"OursOpenVLART-2-XOcto RT-1-XRT-1 WidowX Robot SIMPLER Google Robot SIMPLER (Visual Matching) Google Robot SIMPLER (Variant Aggregation) 17.5 1.1 4.2 51.3 1.2 11 30.2 42.439.3 34.3 43.7 52.454.4 46.3 61.3 74.8 4.9 12.1 71.2 8.0 61.5 5.8 6.8 61.4 Realman Robot Real-World Evaluations (a) (b) (c) Figure 1. (a) Success rate (%) comparison of our model against RT-1 [7], RT-1-X [48], RT-2-X [48], Octo [62], and OpenVLA [30] across simulated benchmarks (first thee charts) and real-world evaluations (las","claim_type":"dataset","confidence":0.95,"evidence_strength":"citation_context"},{"claim_text":"pler imitation learning paradigm in which we develop high- performing policies just by training once offline on a fixed dataset of expert task demonstrations. III. P RELIMINARIES Original OpenVLA formulation. We use OpenVLA [23] as our representative base VLA, a 7B-parameter manipulation policy created by fine-tuning the Prismatic VLM [21] on 1M episodes from the Open X-Embodiment dataset [34]. See Appendix A for architecture details. OpenVLA's original train- ing formulation uses autoregressive","claim_type":"dataset","confidence":0.95,"evidence_strength":"citation_context"},{"claim_text":"Closed-Loop Robot Control Policy User: Wipe the table. OpenVLA: [x, , Grip] = …ΔΔθΔ Multi-Robot Control & Eﬃcient Fine-Tuning Large-Scale Robot Training Data Fully Data Weights Code Open-Source Figure 1: We present OpenVLA, a 7B-parameter open-source vision-language-action model (VLA), trained on 970k robot episodes from the Open X-Embodiment dataset [ 1]. OpenVLA sets a new state of the art for generalist robot manipulation policies. It supports controlling multiple robots out of the box and ca","claim_type":"dataset","confidence":0.95,"evidence_strength":"citation_context"},{"claim_text":"based on autoregressive action token confidence, improving robustness in manipulation. To enhance generalization capability, we initialize HybridVLA with an internet-scale pretrained VLM [27], and design a step-by-step training approach [13, 10]. As shown in Figure 1, our model undergoes further pretraining on large, diverse, cross-embodiment robotic datasets, including Open X-Embodiment [28], DROID [29], and ROBOMIND [30], covering 760K trajectories and over 10K A800 GPU training hours. Subsequ","claim_type":"dataset","confidence":0.9,"evidence_strength":"citation_context"},{"claim_text":"RELATEDWORK A. Vision-Language-Action Models for Robot Planning Vision-language models have enabled natural language task specification and common-sense reasoning for robot plan- ning [2, 17, 28, 38]. The Robotics Transformer series [10, 11] demonstrated that web-scale pretraining transfers to robotic control, motivating large-scale cross-embodiment efforts [14, 45, 30]. Recent work has pushed toward more capable gen- eralist policies through flow matching [8], embodied reason- TABLE I:Adaptatio","claim_type":"background","confidence":0.9,"evidence_strength":"citation_context"}],"why_cited":"Pith tracks Open X-Embodiment: Robotic Learning Datasets and RT-X Models because it crossed a citation-hub threshold. Current citing contexts most often use it as background evidence (21 contexts).","role_counts":[{"n":21,"context_role":"background"},{"n":18,"context_role":"dataset"},{"n":2,"context_role":"baseline"},{"n":1,"context_role":"method"}]},"error":null,"updated_at":"2026-05-23T19:14:46.250411+00:00"},"summary_claims":{"job_type":"summary_claims","status":"succeeded","result":{"title":"Open X-Embodiment: Robotic Learning Datasets and RT-X Models","claims":[{"claim_text":"Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train generalist X-robot policy that can be adapted efficiently to new robots, tasks, and enviro","claim_type":"abstract","evidence_strength":"source_metadata"}],"why_cited":"Pith tracks Open X-Embodiment: Robotic Learning Datasets and RT-X Models because it crossed a citation-hub threshold.","role_counts":[]},"error":null,"updated_at":"2026-05-14T14:31:28.859136+00:00"}},"summary":{"title":"Open X-Embodiment: Robotic Learning Datasets and RT-X Models","claims":[{"claim_text":"Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train generalist X-robot policy that can be adapted efficiently to new robots, tasks, and enviro","claim_type":"abstract","evidence_strength":"source_metadata"}],"why_cited":"Pith tracks Open X-Embodiment: Robotic Learning Datasets and RT-X Models because it crossed a citation-hub threshold.","role_counts":[]},"graph":{"co_cited":[{"title":"OpenVLA: An Open-Source Vision-Language-Action Model","work_id":"3e7e65c5-5aed-4fe9-8414-2092bcb31cc7","shared_citers":27},{"title":"RT-1: Robotics Transformer for Real-World Control at Scale","work_id":"e11bda85-8531-46bc-a07f-d0ade3643ab1","shared_citers":22},{"title":"RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control","work_id":"ff438a8a-8003-4fae-9131-acd418b3597b","shared_citers":22},{"title":"$\\pi_0$: A Vision-Language-Action Flow Model for General Robot Control","work_id":"f790abdc-a796-482f-a40d-f8ee035ecfc2","shared_citers":19},{"title":"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware","work_id":"6fe159e0-fa73-481a-88d4-4719c15140be","shared_citers":18},{"title":"Octo: An Open-Source Generalist Robot Policy","work_id":"f9ca0722-8855-48c3-a27a-0eefb7e19253","shared_citers":16},{"title":"DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset","work_id":"13253de2-3d89-415c-8c2f-3adb25d4c337","shared_citers":14},{"title":"Do As I Can, Not As I Say: Grounding Language in Robotic Affordances","work_id":"037320f1-b0a9-4cbe-a639-bfb25409ce71","shared_citers":13},{"title":"$\\pi_{0.5}$: a Vision-Language-Action Model with Open-World Generalization","work_id":"d1ad7304-d09a-49bc-809e-846439f6aff9","shared_citers":12},{"title":"PaLM-E: An Embodied Multimodal Language Model","work_id":"5b99811a-1d93-47e2-9d59-f4045a0b74a2","shared_citers":11},{"title":"GR00T N1: An Open Foundation Model for Generalist Humanoid Robots","work_id":"e2db69c7-ee8a-4cb7-a761-7b8de1dfcf97","shared_citers":9},{"title":"3D-VLA: A 3D Vision-Language-Action Generative World Model","work_id":"aebf924c-e761-437e-9cee-f1ccc2e427bd","shared_citers":8},{"title":"Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets","work_id":"59e728c0-b6ca-4759-a8f4-02b981f2220f","shared_citers":8},{"title":"Flow Matching for Generative Modeling","work_id":"6edb71c4-5d64-40af-a394-9757ea051a36","shared_citers":8},{"title":"LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning","work_id":"662203ad-084f-42c4-8e60-977b3173755b","shared_citers":8},{"title":"Qwen3-VL Technical Report","work_id":"1fe243aa-e3c0-4da6-b391-4cbcfc88d5c0","shared_citers":8},{"title":"DINOv2: Learning Robust Visual Features without Supervision","work_id":"26b304e5-b54a-4f26-be7e-83299eca52e4","shared_citers":7},{"title":"RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation","work_id":"12319725-bc7d-4c32-a229-ad270a7460bc","shared_citers":7},{"title":"Tinyvla: To- wards fast, data-efficient vision-language-action models for robotic manipulation","work_id":"20b3dc32-113f-4f7b-8826-52a8d03fd6b3","shared_citers":7},{"title":"Diffusion Policy: Visuomotor Policy Learning via Action Diffusion","work_id":"2dce18e6-f07a-4f57-8a81-e71c3e6a293c","shared_citers":6},{"title":"Mimicgen: A data generation system for scalable robot learning using human demonstrations","work_id":"ec7924ec-9bf2-4ccd-9901-6697dc517f84","shared_citers":6},{"title":"Mo- bile aloha: Learning bimanual mobile manipulation with low-cost whole-body teleoperation","work_id":"5f6ff8ef-ed80-4c00-92c2-361c80bf8448","shared_citers":6},{"title":"Qt-opt: Scalable deep reinforcement learning for vision-based robotic manipulation","work_id":"9a64fd9b-aa3d-48af-ac99-b368b857a6e0","shared_citers":6},{"title":"RoboTwin 2.0: A Scalable Data Generator and Benchmark with Strong Domain Randomization for Robust Bimanual Robotic Manipulation","work_id":"9b985126-4a2f-4bdf-b014-2a7524ec634e","shared_citers":6}],"time_series":[{"n":11,"year":2024},{"n":3,"year":2025},{"n":33,"year":2026}],"dependency_candidates":[]},"authors":[{"id":"64527222-395f-4b4f-a5ba-d8b4f0d84a89","orcid":null,"display_name":"Abby O'Neill","source":"manual","import_confidence":0.72},{"id":"335f6bd2-0c9d-497e-9fed-bc38209cc78d","orcid":null,"display_name":"Abdul Rehman","source":"manual","import_confidence":0.72},{"id":"b53a2155-7888-4100-b454-717cef14e7ae","orcid":null,"display_name":"Abhinav Gupta","source":"manual","import_confidence":0.72},{"id":"5a8dab45-493d-46dc-a088-9053823df692","orcid":null,"display_name":"Abhiram Maddukuri","source":"manual","import_confidence":0.72},{"id":"88aa2508-3d42-40f8-95a7-ba87ca0eef09","orcid":null,"display_name":"Abhishek Gupta","source":"manual","import_confidence":0.72},{"id":"c02ddc4a-8d9b-4428-bf16-930ccb7d1495","orcid":null,"display_name":"Open X-Embodiment Collaboration","source":"manual","import_confidence":0.72}]}}