A Principled Approach for Creating High-fidelity Synthetic Demonstrations for Imitation Learning
Pith reviewed 2026-05-09 14:34 UTC · model grok-4.3
The pith
By modeling expert trajectories as dynamic movement primitives and retargeting them in 3D Gaussian Splatting scenes, the approach generates synthetic demonstrations that better preserve motion structure for improved imitation learning.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Demonstration synthesis should treat the expert trajectory as a strong prior. Modeling the expert trajectory using Dynamic Movement Primitives (DMPs) and retargeting it to new goals, object configurations, and viewpoints within a reconstructed 3DGS scene yields phase-consistent, shape-preserving motion by construction. To safely realize this expert-preserving diversity in cluttered scenes, an analytic obstacle-aware DMP formulation that operates directly on the continuous density field induced by the 3DGS representation enables collision avoidance while minimally perturbing the nominal expert motion.
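The mechanism the claim rests on can be sketched in a few dozen lines. The following minimal one-dimensional DMP (an illustrative reconstruction, not the paper's code; the class and parameter names are invented) fits radial-basis forcing weights to one expert trajectory and then rolls the same weights out toward a new goal, which is the "shape-preserving by construction" property at issue:

```python
import numpy as np

class DMP1D:
    """Minimal one-dimensional discrete DMP (Ijspeert-style), tau = 1.

    Transformation system:  z' = alpha*(beta*(g - y) - z) + f(x),  y' = z
    Canonical system:       x' = -alpha_x * x
    Forcing term:           f(x) = x*(g - y0) * sum(psi*w) / sum(psi)
    """

    def __init__(self, n_basis=20, alpha=25.0, beta=6.25, alpha_x=4.0):
        self.n, self.alpha, self.beta, self.alpha_x = n_basis, alpha, beta, alpha_x
        self.c = np.exp(-alpha_x * np.linspace(0, 1, n_basis))  # basis centers in phase
        self.h = 1.0 / (np.gradient(self.c) ** 2)               # basis widths
        self.w = np.zeros(n_basis)

    def fit(self, y_demo, dt):
        """Fit forcing weights to one expert trajectory (weighted regression)."""
        T = len(y_demo)
        self.y0, self.g = float(y_demo[0]), float(y_demo[-1])
        yd = np.gradient(y_demo, dt)
        ydd = np.gradient(yd, dt)
        x = np.exp(-self.alpha_x * dt * np.arange(T))           # phase over time
        f_target = ydd - self.alpha * (self.beta * (self.g - y_demo) - yd)
        xi = x * (self.g - self.y0)                             # per-step forcing scale
        for i in range(self.n):
            psi = np.exp(-self.h[i] * (x - self.c[i]) ** 2)
            self.w[i] = (psi * xi) @ f_target / ((psi * xi) @ xi + 1e-10)

    def rollout(self, goal, T, dt):
        """Integrate toward a (possibly new) goal with the learned shape."""
        y, z, x = self.y0, 0.0, 1.0
        traj = np.empty(T)
        for k in range(T):
            psi = np.exp(-self.h * (x - self.c) ** 2)
            f = x * (goal - self.y0) * (psi @ self.w) / (psi.sum() + 1e-10)
            z += dt * (self.alpha * (self.beta * (goal - y) - z) + f)
            y += dt * z
            x += dt * (-self.alpha_x * x)
            traj[k] = y
        return traj

# Fit on a minimum-jerk expert motion from 0 to 1, then retarget to goal 2.
dt, T = 0.01, 100
t = np.linspace(0.0, 1.0, T)
demo = 10 * t**3 - 15 * t**4 + 6 * t**5
dmp = DMP1D()
dmp.fit(demo, dt)
reproduced = dmp.rollout(goal=1.0, T=250, dt=dt)
retargeted = dmp.rollout(goal=2.0, T=250, dt=dt)
```

Because the transformation system is linear in the goal offset here (start at 0), the retargeted rollout is a scaled copy of the reproduction; in 3D, each dimension is handled the same way with a shared phase variable, which is what makes the result phase-consistent.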
What carries the argument
Dynamic Movement Primitives retargeted to new configurations inside a 3D Gaussian Splatting scene, combined with an analytic obstacle-aware formulation that acts on the scene's density field.
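The density-field machinery can be illustrated on a toy scene. The sketch below is an assumption-laden stand-in: the paper's exact obstacle term is not given here, and real 3DGS Gaussians are anisotropic with per-splat covariances. It evaluates a sum-of-Gaussians density and its closed-form gradient, then derives a repulsive acceleration from it:

```python
import numpy as np

# Toy stand-in for a 3DGS scene: K isotropic Gaussians with centers mu,
# widths sigma, and opacities a. Real 3DGS uses anisotropic covariances,
# but the density and its gradient stay analytic in the same way.
mu = np.array([[0.5, 0.0, 0.3]])
sigma = np.array([0.08])
a = np.array([1.0])

def density(p):
    """rho(p) = sum_k a_k * exp(-||p - mu_k||^2 / (2 sigma_k^2))."""
    d2 = np.sum((p - mu) ** 2, axis=1)
    return float(np.sum(a * np.exp(-d2 / (2 * sigma**2))))

def density_grad(p):
    """Closed-form gradient of rho; no finite differences or meshes needed."""
    diff = p - mu                                             # (K, 3)
    gk = a * np.exp(-np.sum(diff**2, axis=1) / (2 * sigma**2)) / sigma**2
    return -(gk[:, None] * diff).sum(axis=0)

def obstacle_accel(p, gain=5.0):
    """Repulsive acceleration pushing the trajectory down the density."""
    return -gain * density_grad(p)

p_near = np.array([0.6, 0.0, 0.3])   # just +x of the obstacle center
p_far = np.array([2.0, 2.0, 2.0])
```

The point is that the gradient is analytic in the splat parameters, so collision avoidance needs no meshing or distance-field extraction from the scene, and the coupling term vanishes away from obstacles, leaving the nominal expert motion untouched.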
If this is right
- Trajectories show lower deviation from the expert path than those generated by sampling-based planners or trajectory optimization.
- Collision rates drop in cluttered environments while keeping the original motion shape.
- Diffusion-based visuomotor policies trained on the data reach higher task success rates.
- Photorealistic rendering and geometric reasoning are unified without requiring separate scene representations.
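The first bullet presupposes a deviation metric between a synthesized path and the expert path. One common, timing-insensitive choice is a dynamic-time-warping distance (an illustrative assumption; the paper may measure deviation differently):

```python
import numpy as np

def dtw_deviation(path_a, path_b):
    """Length-normalized dynamic-time-warping distance between (T, D) paths.

    A monotone warping aligns the two paths, so purely temporal
    differences are discounted and spatial shape differences dominate.
    """
    A, B = np.asarray(path_a, float), np.asarray(path_b, float)
    n, m = len(A), len(B)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(A[i - 1] - B[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)

# Same geometric path with different timing vs. a spatially shifted path.
t = np.linspace(0.0, 1.0, 100)
expert = np.stack([t, np.sin(np.pi * t)], axis=1)
t_slow = np.linspace(0.0, 1.0, 160)
retimed = np.stack([t_slow, np.sin(np.pi * t_slow)], axis=1)
shifted = expert + np.array([0.0, 0.2])

d_retimed = dtw_deviation(expert, retimed)
d_shifted = dtw_deviation(expert, shifted)
```

Under this metric a retimed copy of the expert path scores near zero while a spatially shifted one does not, which is the sense in which DMP rollouts should beat planner output.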
Where Pith is reading between the lines
- The same retargeting principle could reduce the volume of real demonstrations needed for other contact-sensitive tasks such as assembly or insertion.
- If phase consistency holds across longer horizons, the approach might limit error compounding when policies are rolled out in sequence.
- Online updates to the 3DGS model during execution could allow the retargeting step to adapt demonstrations on the fly to changing object positions.
Load-bearing premise
Preserving the expert trajectory's spatial and temporal structure via DMP retargeting is critical for contact-rich and shape-sensitive manipulation tasks, and the analytic obstacle-aware formulation can achieve collision avoidance with only minimal perturbation of the nominal motion.
What would settle it
A side-by-side policy-training trial on the same three manipulation tasks would settle it: if the DMP-retargeted demonstrations produce equal or lower success rates than planner- or optimization-based demonstrations, the performance claim is falsified.
Original abstract
Recent advances in 3D Gaussian Splatting (3DGS) have enabled visually realistic demonstration generation from a single expert trajectory and a short multi-view scan. However, existing 3DGS-based synthesis pipelines typically generate new motions using sampling-based planners or trajectory optimization, which often deviate substantially from the expert's demonstrated path. While such deviations may be acceptable for tasks insensitive to motion shape, they discard subtle spatial and temporal structure that is critical for contact-rich and shape-sensitive manipulation, causing increased demonstration diversity to harm downstream policy learning. We argue that demonstration synthesis should treat the expert trajectory as a strong prior. Building on this principle, we propose a framework that synthesizes diverse task demonstrations while explicitly preserving expert motion structure. We model the expert trajectory using Dynamic Movement Primitives (DMPs) and retarget it to new goals, object configurations, and viewpoints within a reconstructed 3DGS scene, yielding phase-consistent, shape-preserving motion by construction. To safely realize this expert-preserving diversity in cluttered scenes, we introduce an analytic obstacle-aware DMP formulation that operates directly on the continuous density field induced by the 3DGS representation. This enables collision avoidance while minimally perturbing the nominal expert motion, unifying photorealistic rendering and geometric reasoning without additional scene representations. We evaluate our approach on a Spot mobile manipulator across three manipulation tasks with increasing sensitivity to trajectory fidelity. Compared to planner- and optimization-based synthesis, our method produces trajectories with lower deviation and collision rates and yields higher task success when training diffusion-based visuomotor policies.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that modeling expert trajectories with Dynamic Movement Primitives (DMPs) and retargeting them to new goals, object configurations, and viewpoints in a 3D Gaussian Splatting (3DGS) scene, combined with an analytic obstacle-aware DMP formulation operating on the 3DGS density field, produces diverse synthetic demonstrations that preserve expert spatial and temporal structure better than planner- or optimization-based methods. This leads to lower trajectory deviation and collision rates, and higher success when training diffusion-based visuomotor policies, as demonstrated on three manipulation tasks with increasing fidelity sensitivity using a Spot mobile manipulator.
Significance. If the results hold, the work provides a principled alternative to sampling-based or optimization-driven demonstration synthesis by treating the expert trajectory as a strong prior via DMP retargeting, which is particularly relevant for contact-rich and shape-sensitive manipulation where motion structure matters. The integration of photorealistic 3DGS rendering with continuous geometric reasoning in the obstacle term is a clear strength, as is the real-robot evaluation across tasks of varying sensitivity. This could reduce reliance on extensive real data collection while supporting better downstream imitation learning performance.
Major comments (1)
- [Abstract] The central claim that DMP retargeting 'yields phase-consistent, shape-preserving motion by construction' is load-bearing for the argument that this approach is superior for contact-rich tasks. However, standard DMP goal retargeting (fixed forcing function, shifted attractor) can alter curvature, timing, and workspace paths when new object positions change the required approach geometry or contact points, even before the obstacle term is applied; the analytic obstacle-aware formulation on the 3DGS density field does not explicitly constrain object-interaction geometry, so the 'minimal perturbation' guarantee may not hold on the targeted tasks.
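The referee's mechanism can be checked directly on the standard forcing term. In the common Ijspeert-style formulation (assumed here; the paper may use a goal-robust variant), the learned weights stay fixed and only the scalar (g − y0) changes at retarget time, so a goal that crosses the start mirrors the learned shape and a goal coinciding with the start silences it entirely:

```python
import numpy as np

# Standard discrete-DMP forcing term (Ijspeert-style formulation):
#   f(x) = x * (g - y0) * sum_i psi_i(x) w_i / sum_i psi_i(x)
# The weights w are learned once from the expert and frozen; at retarget
# time only the scalar (g - y0) changes, so the workspace shape is scaled,
# not re-planned around the new approach geometry.

def forcing(x, w, c, h, g, y0):
    psi = np.exp(-h * (x - c) ** 2)
    return x * (g - y0) * (psi @ w) / psi.sum()

c = np.linspace(1.0, 0.01, 10)                     # basis centers over phase
h = np.full(10, 50.0)                              # basis widths
w = np.random.default_rng(0).normal(size=10)       # "learned" weights, frozen

x = 0.5                                            # an arbitrary phase value
f_orig = forcing(x, w, c, h, g=1.0, y0=0.0)
f_mirrored = forcing(x, w, c, h, g=-1.0, y0=0.0)   # goal crosses the start
f_degenerate = forcing(x, w, c, h, g=0.0, y0=0.0)  # goal equals the start
```

Both effects are well known for this formulation and motivated goal-robust variants such as Pastor et al.'s; whether the paper's retargeting inherits them is exactly what the referee is probing.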
Minor comments (1)
- [Abstract] The positive comparative results are stated without quantitative metrics, baseline details, statistical tests, or references to specific tables or figures; including at least summary numbers (e.g., mean deviation, collision rate, success percentage) would strengthen the abstract.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed review. The major comment raises an important point about the guarantees of DMP retargeting. We address it directly below with clarifications on how our method maintains motion structure while handling new configurations.
Point-by-point responses
Referee: [Abstract] The central claim that DMP retargeting 'yields phase-consistent, shape-preserving motion by construction' is load-bearing for the argument that this approach is superior for contact-rich tasks. However, standard DMP goal retargeting (fixed forcing function, shifted attractor) can alter curvature, timing, and workspace paths when new object positions change the required approach geometry or contact points, even before the obstacle term is applied; the analytic obstacle-aware formulation on the 3DGS density field does not explicitly constrain object-interaction geometry, so the 'minimal perturbation' guarantee may not hold on the targeted tasks.
Authors: We agree that standard DMP goal retargeting shifts the attractor and can modify absolute workspace paths. However, the fixed forcing function learned from the expert trajectory preserves the demonstrated shape, phase, and relative timing by construction, which is the key property we leverage. For new object configurations, we explicitly adjust the goal position to align with the required contact or approach geometry in the updated scene (derived from the 3DGS reconstruction), ensuring interaction points remain consistent before the obstacle term is applied. The analytic obstacle-aware formulation then adds only the minimal perturbation needed to avoid collisions with the continuous density field, without altering the core DMP dynamics. Our experiments on three Spot tasks with increasing fidelity sensitivity show lower trajectory deviation and higher policy success than planner-based methods, empirically supporting better structure preservation. We will revise the abstract and method section to explicitly describe the goal-adjustment step for new object configurations and to clarify the scope of the 'minimal perturbation' guarantee.
Revision: partial
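The goal-adjustment step the authors describe can be made concrete as a rigid remap of the goal pose through the object's old and new poses (a sketch under the assumption that goals are rigidly attached to object frames; all function names are hypothetical):

```python
import numpy as np

def make_T(R, t):
    """Assemble a 4x4 homogeneous transform from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

def retarget_goal(goal_old, T_obj_old, T_obj_new):
    """Remap a goal pose rigidly with the object it is attached to.

    All poses are 4x4 homogeneous transforms in the world frame (here, the
    frame of the 3DGS reconstruction). Expressing the goal in the object
    frame once and pushing it through the new object pose keeps the
    contact/approach geometry relative to the object unchanged.
    """
    goal_in_obj = np.linalg.inv(T_obj_old) @ goal_old
    return T_obj_new @ goal_in_obj

# Object originally at the origin; goal is a grasp pose 10 cm in front of it.
T_old = make_T(np.eye(3), [0.0, 0.0, 0.0])
goal_old = make_T(np.eye(3), [0.1, 0.0, 0.05])

# New scene: object translated 20 cm in x and rotated 90 degrees about z.
Rz90 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
T_new = make_T(Rz90, [0.2, 0.0, 0.0])

goal_new = retarget_goal(goal_old, T_old, T_new)
```

The remapped goal then serves as the DMP attractor, so the interaction point follows the object while the obstacle term handles any clutter along the new approach.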
Circularity Check
No circularity: standard DMP retargeting and empirical validation form an independent derivation chain
Full rationale
The paper models expert trajectories via DMPs, retargets them to altered goals and scenes while claiming phase-consistent shape preservation by construction of the DMP equations, and augments with an analytic obstacle term on the 3DGS density field. These steps invoke well-known DMP properties and a new but explicitly derived analytic formulation rather than fitting parameters to the target metric or reducing via self-citation. Superiority claims rest on direct robot experiments measuring deviation, collisions, and downstream policy success, which are falsifiable outside the method definition itself. No load-bearing step equates its output to its input by construction.