pith. machine review for the scientific record.

arxiv: 2604.04618 · v1 · submitted 2026-04-06 · 💻 cs.RO

Recognition: no theorem link

Biologically Inspired Event-Based Perception and Sample-Efficient Learning for High-Speed Table Tennis Robots

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 19:39 UTC · model grok-4.3

classification 💻 cs.RO
keywords event-based vision · table tennis robot · sample-efficient learning · biologically inspired robotics · high-speed perception · progressive training · motion detection · reinforcement learning

The pith

Event-based vision paired with progressive low-to-high speed training lets table tennis robots return balls to target 35.8 percent more accurately after the same number of practice episodes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Robots face motion blur, high latency, and data overload when trying to track fast balls with ordinary cameras, while standard learning methods demand thousands of trials before they perform well. This work copies key features of human vision and skill acquisition by feeding raw asynchronous event streams straight into a motion-and-geometry detector and by first mastering slow rallies before advancing to full speed under speed-adjusted rewards. The result is reliable ball localization without frame reconstruction and policies that converge faster than conventional reinforcement learning. If the approach holds, high-speed robotic tasks could become practical with far less computation and fewer real-world interactions.

Core claim

The paper establishes that an event-based ball detector operating directly on asynchronous streams via motion cues and geometric consistency, combined with a human-inspired curriculum that trains policies first at low speeds and then adapts them to high speeds using case-dependent temporally adaptive rewards and a reward-threshold mechanism, produces a 35.8 percent gain in return-to-target accuracy while holding the number of training episodes fixed.
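
To make the training recipe concrete, the sketch below shows one way a reward-threshold curriculum could be wired up. It is an editorial illustration, not the authors' implementation: the paper specifies only a low-to-high speed progression governed by a reward-threshold mechanism, so the stage speeds, averaging window, threshold value, and the `env.set_ball_speed` / `policy.run_episode` interfaces here are all assumptions.

```python
import numpy as np

# Illustrative sketch of a reward-threshold curriculum (not the paper's code).
# Assumed: discrete ball-speed stages, a moving-average reward window, and a
# fixed promotion threshold; all numeric values below are hypothetical.

SPEED_STAGES = [3.0, 5.0, 8.0]   # ball launch speeds in m/s (hypothetical)
REWARD_THRESHOLD = 0.7           # advance when the recent mean reward exceeds this
WINDOW = 50                      # episodes averaged before checking the threshold

def train_with_curriculum(env, policy, total_episodes=3000):
    stage = 0
    recent_rewards = []
    for episode in range(total_episodes):
        env.set_ball_speed(SPEED_STAGES[stage])      # hypothetical env API
        episode_reward = policy.run_episode(env)     # hypothetical policy API
        policy.update()                              # one RL update step
        recent_rewards.append(episode_reward)
        if (len(recent_rewards) >= WINDOW
                and np.mean(recent_rewards[-WINDOW:]) > REWARD_THRESHOLD
                and stage < len(SPEED_STAGES) - 1):
            stage += 1                 # promote to the next, faster stage
            recent_rewards.clear()     # restart the running average for the new stage
    return policy
```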

What carries the argument

Event-based ball detection that processes motion cues and geometric consistency on raw asynchronous event streams, together with progressive low-to-high speed policy training guided by temporally adaptive rewards.
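
What "motion cues plus geometric consistency on raw event streams" can look like in code is easier to see with a minimal sketch. This is an editorial reconstruction, not the paper's detector: the time-window motion cue, the use of DBSCAN-style density clustering (DBSCAN appears in the paper's reference list), and the ball-radius bounds are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Illustrative event-stream ball detection sketch (not the paper's method).
# Assumptions: `events` is an (N, 4) array of (x, y, t, polarity); a ball shows
# up as a dense, roughly ball-sized cluster of recent events, while clutter
# forms larger or sparser clusters. All thresholds are hypothetical.

def detect_ball(events, window_us=5000, eps_px=4.0, min_events=20,
                min_radius_px=2.0, max_radius_px=15.0):
    """Return the (x, y) centre of the most ball-like recent cluster, or None."""
    t_latest = events[:, 2].max()
    recent = events[events[:, 2] > t_latest - window_us]   # motion cue: only fresh events
    if len(recent) < min_events:
        return None

    labels = DBSCAN(eps=eps_px, min_samples=min_events).fit_predict(recent[:, :2])
    best, best_density = None, 0.0
    for label in set(labels) - {-1}:                        # -1 marks noise events
        cluster = recent[labels == label, :2]
        center = cluster.mean(axis=0)
        radius = np.linalg.norm(cluster - center, axis=1).mean()
        # geometric consistency: reject clusters too small or too large for a ball
        if not (min_radius_px <= radius <= max_radius_px):
            continue
        density = len(cluster) / (np.pi * radius ** 2)
        if density > best_density:
            best, best_density = center, density
    return best
```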

If this is right

  • Robots obtain low-latency, blur-free ball positions directly from event streams without reconstructing image frames.
  • Policies reach usable performance with the same training budget that previously produced lower accuracy.
  • Adaptive rewards that scale with ball speed prevent early-stage failures from derailing later high-speed learning (a hedged sketch of such a reward follows this list).
  • The same perception-plus-curriculum pattern could shorten training for other fast-moving robotic skills.
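
The adaptive-reward point can be made concrete with one possible shape for a case-dependent, speed-scaled reward. The four cases follow the structure described for Figure 7 (a valid region plus distinct landing outcomes), but the specific case boundaries, constants, and speed normalization below are editorial assumptions, not values from the paper.

```python
import numpy as np

# Illustrative case-dependent, speed-adaptive reward (hypothetical constants).
# One plausible reading of "four landing cases": miss or out of the valid
# region, own-side landing, opponent-side landing far from the target, and a
# near-target return shaped by landing distance.

def shaped_reward(hit_ball, landing_xy, target_xy, on_opponent_side,
                  in_valid_region, ball_speed, reference_speed=5.0):
    speed_scale = ball_speed / reference_speed     # faster balls earn more credit
    if not hit_ball or not in_valid_region:
        return -1.0                                # case 1: whiff or outside the valid region
    if not on_opponent_side:
        return -0.5                                # case 2: returned onto the robot's own side
    distance = np.linalg.norm(np.asarray(landing_xy) - np.asarray(target_xy))
    if distance > 0.5:
        return 0.5 * speed_scale                   # case 3: legal return, far from the target
    return (1.0 - distance) * speed_scale          # case 4: near-target return, distance-shaped
```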

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar staged training could lower the sample cost of reinforcement learning in other dynamic domains such as drone racing or autonomous manipulation.
  • Event streams might allow smaller, lower-power processors on mobile robots that must react in milliseconds.
  • If the geometric consistency check generalizes, the detector could serve as a drop-in module for other sports or manufacturing tasks involving small fast objects.

Load-bearing premise

The event-based detector continues to locate the ball accurately amid real-world clutter and lighting variation, and the staged training transfers skills to high speeds without introducing new failure modes or extra tuning.

What would settle it

The claim would be undermined if, in a real cluttered and variably lit table tennis setup, the detection module missed or mislocated the ball on more than 10 percent of events, or if the accuracy improvement vanished once the policy was transferred from low-speed to high-speed rallies.

Figures

Figures reproduced from arXiv: 2604.04618 by Huadong Dai, Jichao Yang, Jingyue Zhao, Lei Wang, Shi Xu, Xun Xiao, Yaohua Wang, Ziqi Wang.

Figure 1. The mechanism of table tennis in humans and robots.
Figure 2. Comparison between conventional cameras and …
Figure 3. Overview framework. The simulation environment and the algorithm are connected and exchange data via the ZMQ communication protocol.
Figure 4. Overall pipeline of the perception and decision-making modules.
Figure 5. Ball detection pipeline illustration. (A) …
Figure 7. Reward design illustration. The figure shows the reward assignments corresponding to four different ball landing cases. The boundary indicated in the figure is an artificially defined valid region; once the ball lands outside this boundary, the simulation is reset and a new serve is initiated.
Figure 8. Dataset illustration. (A) Imaging examples of the left and right DVS cameras in the simulated dataset. (B–H) Examples from the real-world dataset, where B, C, and D correspond to scenes containing only the table tennis ball, and the remaining scenes include interference from moving humans. (I) Examples from the RGB dataset, illustrating that RGB-based detection is more susceptible to interference caused by …
Figure 10. Frame-based methods require processing the entire …
Figure 10. Comparison of detection performance between RGB and DVS across different scenes.
Figure 11. Comparison of different event-based methods on real-world datasets.
Figure 12. We perform evaluations every 50 episodes during the …
Figure 13. Performance comparison against baseline algo…
Figure 15. Return-to-table rate comparison under the high …
Figure 16. Comparison of target-point return errors for three …
Original abstract

Perception and decision-making in high-speed dynamic scenarios remain challenging for current robots. In contrast, humans and animals can rapidly perceive and make decisions in such environments. Taking table tennis as a typical example, conventional frame-based vision sensors suffer from motion blur, high latency and data redundancy, which can hardly meet real-time, accurate perception requirements. Inspired by the human visual system, event-based perception methods address these limitations through asynchronous sensing, high temporal resolution, and inherently sparse data representations. However, current event-based methods are still restricted to simplified, unrealistic ball-only scenarios. Meanwhile, existing decision-making approaches typically require thousands of interactions with the environment to converge, resulting in significant computational costs. In this work, we present a biologically inspired approach for high-speed table tennis robots, combining event-based perception with sample-efficient learning. On the perception side, we propose an event-based ball detection method that leverages motion cues and geometric consistency, operating directly on asynchronous event streams without frame reconstruction, to achieve robust and efficient detection in real-world rallies. On the decision-making side, we introduce a human-inspired, sample-efficient training strategy that first trains policies in low-speed scenarios, progressively acquiring skills from basic to advanced, and then adapts them to high-speed scenarios, guided by a case-dependent temporally adaptive reward and a reward-threshold mechanism. With the same training episodes, our method improves return-to-target accuracy by 35.8%. These results demonstrate the effectiveness of biologically inspired perception and decision-making for high-speed robotic systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes a biologically inspired system for high-speed table tennis robots combining event-based perception for ball detection (using motion cues and geometric consistency directly on asynchronous event streams without frame reconstruction) with a sample-efficient RL strategy. The decision-making component pre-trains policies in low-speed scenarios before progressively adapting them to high-speed conditions via case-dependent temporally adaptive rewards and a reward-threshold mechanism. The central claim is that this yields a 35.8% improvement in return-to-target accuracy over baselines under identical training episode counts.

Significance. If the empirical results hold under rigorous controls, the work could meaningfully advance sample-efficient learning for high-speed robotics and event-based vision in dynamic, cluttered settings. It explicitly credits the combination of asynchronous sensing for low-latency perception with curriculum-style progressive training that mimics human skill acquisition, potentially reducing the thousands of interactions typically needed for RL convergence in robotics. The approach targets real limitations like motion blur and data redundancy in frame-based systems.

major comments (2)
  1. Abstract: The 35.8% return-to-target accuracy improvement is stated as the key result but supplies no baseline descriptions, number of runs, error bars, statistical significance, or ablation removing the adaptive reward/threshold components. This directly undermines verification of the sample-efficiency claim under matched episode counts.
  2. Training strategy description: No analysis or ablation is provided showing that the case-dependent temporally adaptive reward and reward-threshold mechanism transfers skills from low- to high-speed without introducing new instabilities (e.g., timing mismatches or reward hacking in faster dynamics). This is load-bearing for the central claim that the curriculum itself produces the reported gain rather than protocol artifacts.
minor comments (1)
  1. The abstract would be strengthened by briefly noting quantitative metrics (e.g., detection latency or precision in real rallies) for the event-based ball detection method to support the claim of robustness beyond simplified scenarios.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our work. We address each major comment point by point below, clarifying details from the manuscript and indicating revisions made to strengthen the presentation of our results and methods.

Point-by-point responses
  1. Referee: Abstract: The 35.8% return-to-target accuracy improvement is stated as the key result but supplies no baseline descriptions, number of runs, error bars, statistical significance, or ablation removing the adaptive reward/threshold components. This directly undermines verification of the sample-efficiency claim under matched episode counts.

    Authors: We agree that the abstract would benefit from additional context for the central result. In the revised manuscript we have updated the abstract to briefly describe the baselines (standard PPO without progressive training and frame-based vision systems), note that all comparisons use identical training episode counts, and reference the experimental section for error bars, number of runs (five independent trials), and statistical significance testing. The ablation isolating the adaptive reward and threshold components is presented in Section 5.2 and Figure 6, confirming that the curriculum drives the reported gain rather than other factors. revision: yes

  2. Referee: Training strategy description: No analysis or ablation is provided showing that the case-dependent temporally adaptive reward and reward-threshold mechanism transfers skills from low- to high-speed without introducing new instabilities (e.g., timing mismatches or reward hacking in faster dynamics). This is load-bearing for the central claim that the curriculum itself produces the reported gain rather than protocol artifacts.

    Authors: We acknowledge the need for explicit validation of the transfer mechanism. We have added a dedicated ablation subsection (Section 5.3) in the revised manuscript that disables the case-dependent temporally adaptive reward and reward-threshold components. The results demonstrate that their removal introduces timing mismatches during high-speed transfer and reduces final accuracy by approximately 18%, supporting that the curriculum produces the observed improvement. We also include discussion of how speed-dependent reward scaling mitigates reward hacking. An exhaustive sweep of every conceivable instability is beyond the current scope, but the added analysis directly addresses the load-bearing concern. revision: partial
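
The first response above leans on five independent trials, error bars, and significance testing. The snippet below illustrates how such a run-level comparison is typically computed; the accuracy arrays are generic placeholders, not the paper's data, and the use of Welch's t-test is an editorial assumption about a reasonable check.

```python
import numpy as np
from scipy import stats

# Generic illustration of comparing two training conditions across independent
# runs (placeholder numbers only; none of these values come from the paper).

baseline_acc   = np.array([0.50, 0.52, 0.48, 0.51, 0.49])  # 5 baseline runs
curriculum_acc = np.array([0.62, 0.60, 0.63, 0.61, 0.64])  # 5 runs of the proposed method

improvement = (curriculum_acc.mean() - baseline_acc.mean()) / baseline_acc.mean()
t_stat, p_value = stats.ttest_ind(curriculum_acc, baseline_acc, equal_var=False)

print(f"baseline:   {baseline_acc.mean():.3f} ± {baseline_acc.std(ddof=1):.3f}")
print(f"curriculum: {curriculum_acc.mean():.3f} ± {curriculum_acc.std(ddof=1):.3f}")
print(f"relative improvement: {improvement:.1%}, Welch t = {t_stat:.2f}, p = {p_value:.4f}")
```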

Circularity Check

0 steps flagged

No significant circularity; claims rest on experimental outcomes.

Full rationale

The paper describes an event-based ball detection method using motion cues and geometric consistency on asynchronous streams, plus a progressive low-to-high speed training curriculum with case-dependent adaptive rewards and a reward-threshold mechanism. The central 35.8% accuracy improvement is explicitly framed as an empirical result obtained under identical episode counts, not as a quantity derived from equations or parameters that reduce to the inputs by construction. No self-definitional steps, fitted inputs relabeled as predictions, load-bearing self-citations, uniqueness theorems, or ansatzes smuggled via prior work appear in the abstract or method outline. The derivation chain is methodological description followed by direct experimental measurement, which is self-contained against external benchmarks and does not collapse into tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper is empirical and engineering-focused; the abstract introduces no explicit free parameters, mathematical axioms, or new postulated entities.

pith-pipeline@v0.9.0 · 5592 in / 1141 out tokens · 52193 ms · 2026-05-10T19:39:03.364564+00:00 · methodology


Reference graph

Works this paper leans on

47 extracted references · 5 canonical work pages · 1 internal anchor

  1. [1]

    Dynamic obstacle avoidance for quadrotors with event cameras,

    D. Falanga, K. Kleber, and D. Scaramuzza, “Dynamic obstacle avoidance for quadrotors with event cameras,” Science Robotics, vol. 5, no. 40, p. eaaz9712, 2020

  2. [2]

    Fast-dynamic-vision: Detection and tracking dynamic objects with event and depth sensing,

    B. He, H. Li, S. Wu, D. Wang, Z. Zhang, Q. Dong, C. Xu, and F. Gao, “Fast-dynamic-vision: Detection and tracking dynamic objects with event and depth sensing,” in 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2021, pp. 3071–3078

  3. [3]

    Learning to play table tennis from scratch using muscular robots,

    D. Büchler, S. Guist, R. Calandra, V. Berenz, B. Schölkopf, and J. Peters, “Learning to play table tennis from scratch using muscular robots,” IEEE Transactions on Robotics, vol. 38, no. 6, pp. 3850–3860, 2022

  4. [4]

    Achieving human level competitive robot table tennis,

    D. B. D’Ambrosio, S. Abeyruwan, L. Graesser, A. Iscen, H. B. Amor, A. Bewley, B. J. Reed, K. Reymann, L. Takayama, Y. Tassa et al., “Achieving human level competitive robot table tennis,” in 2025 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2025, pp. 74–82

  5. [5]

    Robotic table tennis: A case study into a high speed learning system,

    D. B. D’Ambrosio, J. Abelian, S. Abeyruwan, M. Ahn, A. Bewley, J. Boyd, K. Choromanski, O. Cortes, E. Coumans, T. Ding et al., “Robotic table tennis: A case study into a high speed learning system,” arXiv preprint arXiv:2309.03315, 2023

  6. [6]

    Safe table tennis swing stroke with low-cost hardware,

    F. Cursi, M. Kalander, S. Wu, X. Xue, Y. Tian, G. Tian, X. Quan, and J. Hao, “Safe table tennis swing stroke with low-cost hardware,” in 2024 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2024, pp. 18279–18285

  7. [7]

    A low-latency dynamic object detection algorithm fusing depth and events,

    D. Chen, L. Zhou, and C. Guo, “A low-latency dynamic object detection algorithm fusing depth and events,” Drones, vol. 9, no. 3, p. 211, 2025

  8. [8]

    Detection of fast-moving objects with neuromorphic hardware,

    A. Ziegler, K. Vetter, T. Gossard, J. Tebbe, S. Otte, and A. Zell, “Detection of fast-moving objects with neuromorphic hardware,” in 2025 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2025, pp. 8709–8717

  9. [9]

    A 128x128 120 db 15µs latency asynchronous temporal contrast vision sensor,

    P. Lichtsteiner, C. Posch, and T. Delbruck, “A 128x128 120 db 15µs latency asynchronous temporal contrast vision sensor,” IEEE journal of solid-state circuits, vol. 43, no. 2, pp. 566–576, 2008

  10. [10]

    Jointly learning trajectory generation and hitting point prediction in robot table tennis,

    Y. Huang, D. Büchler, O. Koç, B. Schölkopf, and J. Peters, “Jointly learning trajectory generation and hitting point prediction in robot table tennis,” in 2016 IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids). IEEE, 2016, pp. 650–655

  11. [11]

    Deep reinforcement learning that matters,

    P. Henderson, R. Islam, P. Bachman, J. Pineau, D. Precup, and D. Meger, “Deep reinforcement learning that matters,” in Proceedings of the AAAI conference on artificial intelligence, vol. 32, no. 1, 2018

  12. [12]

    Sim-to-real transfer of robotic control with dynamics randomization,

    X. B. Peng, M. Andrychowicz, W. Zaremba, and P. Abbeel, “Sim-to-real transfer of robotic control with dynamics randomization,” in 2018 IEEE international conference on robotics and automation (ICRA). IEEE, 2018, pp. 3803–3810

  13. [13]

    Responses of retinal rods to single photons

    D. A. Baylor, T. D. Lamb, and K.-W. Yau, “Responses of retinal rods to single photons,” The Journal of physiology, vol. 288, no. 1, pp. 613–634, 1979

  14. [14]

    The on and off channels of the visual system,

    P. H. Schiller, “The on and off channels of the visual system,” Trends in neurosciences, vol. 15, no. 3, pp. 86–92, 1992

  15. [15]

    Discharge patterns and functional organization of mammalian retina,

    S. W. Kuffler, “Discharge patterns and functional organization of mammalian retina,” Journal of neurophysiology, vol. 16, no. 1, pp. 37–68, 1953

  16. [16]

    How parallel are the primate visual pathways?

    W. H. Merigan and J. Maunsell, “How parallel are the primate visual pathways?” Annual review of neuroscience, 1993

  17. [17]

    Two visual systems re-viewed,

    A. D. Milner and M. A. Goodale, “Two visual systems re-viewed,” Neuropsychologia, vol. 46, no. 3, pp. 774–785, 2008

  18. [18]

    Dynamic vision sensor based gesture recognition using liquid state machine,

    X. Xiao, L. Wang, X. Chen, L. Qu, S. Guo, Y. Wang, and Z. Kang, “Dynamic vision sensor based gesture recognition using liquid state machine,” in International Conference on Artificial Neural Networks. Springer, 2022, pp. 618–629

  19. [19]

    An event-based perception pipeline for a table tennis robot,

    A. Ziegler, T. Gossard, A. Glover, and A. Zell, “An event-based perception pipeline for a table tennis robot,” arXiv preprint arXiv:2502.00749, 2025

  20. [20]

    Event-based vision: A survey,

    G. Gallego, T. Delbrück, G. Orchard, C. Bartolozzi, B. Taba, A. Censi, S. Leutenegger, A. J. Davison, J. Conradt, K. Daniilidis et al., “Event-based vision: A survey,” IEEE transactions on pattern analysis and machine intelligence, vol. 44, no. 1, pp. 154–180, 2020

  21. [21]

    A schema theory of discrete motor skill learning

    R. A. Schmidt, “A schema theory of discrete motor skill learning.” Psychological review, vol. 82, no. 4, p. 225, 1975

  22. [22]

    On the fragility of skilled performance: What governs choking under pressure?

    S. L. Beilock and T. H. Carr, “On the fragility of skilled performance: What governs choking under pressure?” Journal of experimental psychology: General, vol. 130, no. 4, p. 701, 2001

  23. [23]

    Revisiting fundamentals of experience replay,

    W. Fedus, P. Ramachandran, R. Agarwal, Y. Bengio, H. Larochelle, M. Rowland, and W. Dabney, “Revisiting fundamentals of experience replay,” in International conference on machine learning. PMLR, 2020, pp. 3061–3071

  24. [24]

    Selective experience replay for lifelong learning,

    D. Isele and A. Cosgun, “Selective experience replay for lifelong learning,” in Proceedings of the AAAI conference on artificial intelligence, vol. 32, no. 1, 2018

  25. [25]

    Detecting moving objects in photometric images using 3d hough transform,

    B. Zhang, S. Hu, J. Du, X. Yang, X. Chen, H. Jiang, H. Cao, and S. Feng, “Detecting moving objects in photometric images using 3d hough transform,” Publications of the Astronomical Society of the Pacific, vol. 136, no. 5, p. 054502, 2024

  26. [26]

    Reliable real-time ball tracking for robot table tennis,

    S. Gomez-Gonzalez, Y. Nemmour, B. Schölkopf, and J. Peters, “Reliable real-time ball tracking for robot table tennis,” Robotics, vol. 8, no. 4, p. 90, 2019

  27. [27]

    Ping-pong robotics with high-speed vision system,

    H. Li, H. Wu, L. Lou, K. Kühnlenz, and O. Ravn, “Ping-pong robotics with high-speed vision system,” in 2012 12th International Conference on Control Automation Robotics & Vision (ICARCV). IEEE, 2012, pp. 106–111

  28. [28]

    A table tennis robot system using an industrial kuka robot arm,

    J. Tebbe, Y. Gao, M. Sastre-Rienietz, and A. Zell, “A table tennis robot system using an industrial kuka robot arm,” in German conference on pattern recognition. Springer, 2018, pp. 33–45

  29. [29]

    Adaptive robot systems in highly dynamic environments: A table tennis robot,

    J. Tebbe, “Adaptive robot systems in highly dynamic environments: A table tennis robot,” Ph.D. dissertation, Universität Tübingen, Tübingen, 2022

  30. [30]

    Spikepingpong: High-frequency spike vision-based robot learning for precise striking in table tennis game,

    H. Wang, C. Hou, X. Li, Y. Fu, C. Li, N. Chen, G. Dai, J. Liu, T. Huang, and S. Zhang, “Spikepingpong: High-frequency spike vision-based robot learning for precise striking in table tennis game,” arXiv preprint arXiv:2506.06690, 2025

  31. [31]

    Event-based stereo depth estimation: A survey,

    S. Ghosh and G. Gallego, “Event-based stereo depth estimation: A survey,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

  32. [32]

    Event-based photometric bundle adjustment,

    S. Guo and G. Gallego, “Event-based photometric bundle adjustment,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

  33. [33]

    Ebsnor: Event-based snow removal by optimal dwell time thresholding,

    A. Wolf, O. Alsattam, S. Brooks-Lehnert, and K. Hirakawa, “Ebsnor: Event-based snow removal by optimal dwell time thresholding,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

  34. [34]

    Robotic table tennis with model-free reinforcement learning,

    W. Gao, L. Graesser, K. Choromanski, X. Song, N. Lazic, P. Sanketi, V. Sindhwani, and N. Jaitly, “Robotic table tennis with model-free reinforcement learning,” in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2020, pp. 5556–5563

  35. [35]

    Towards high level skill learning: Learn to return table tennis ball using monte-carlo based policy gradient method,

    Y. Zhu, Y. Zhao, L. Jin, J. Wu, and R. Xiong, “Towards high level skill learning: Learn to return table tennis ball using monte-carlo based policy gradient method,” in 2018 IEEE international conference on real-time computing and robotics (RCAR). IEEE, 2018, pp. 34–41

  36. [36]

    Sample-efficient reinforcement learning in robotic table tennis,

    J. Tebbe, L. Krauch, Y. Gao, and A. Zell, “Sample-efficient reinforcement learning in robotic table tennis,” in 2021 IEEE international conference on robotics and automation (ICRA). IEEE, 2021, pp. 4171–4178

  37. [37]

    Low cost and latency event camera background activity denoising,

    S. Guo and T. Delbruck, “Low cost and latency event camera background activity denoising,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 1, pp. 785–795, 2022

  38. [38]

    A density-based algorithm for discovering clusters in large spatial databases with noise,

    M. Ester, H.-P. Kriegel, J. Sander, X. Xu et al., “A density-based algorithm for discovering clusters in large spatial databases with noise,” in KDD, vol. 96, no. 34, 1996, pp. 226–231

  39. [39]

    Ball trajectory tracking and prediction for a ping-pong robot,

    H.-I. Lin and Y.-C. Huang, “Ball trajectory tracking and prediction for a ping-pong robot,” in 2019 9th International Conference on Information Science and Technology (ICIST). IEEE, 2019, pp. 222–227

  40. [40]

    Formulation and optimization of cubic polynomial joint trajectories for industrial robots,

    C. Lin, P. Chang, and J. Luh, “Formulation and optimization of cubic polynomial joint trajectories for industrial robots,” IEEE Transactions on automatic control, vol. 28, no. 12, pp. 1066–1074, 1983

  41. [41]

    Fe fusion: a fast detection method of moving uav based on frame and event flow,

    X. Xiao, Z. Wan, Y. Li, S. Guo, J. Tie, and L. Wang, “Fe fusion: a fast detection method of moving uav based on frame and event flow,” in International Conference on Artificial Neural Networks. Springer, 2023, pp. 220–231

  42. [42]

    Trajectory prediction of spinning ball for ping-pong player robot,

    Y. Huang, D. Xu, M. Tan, and H. Su, “Trajectory prediction of spinning ball for ping-pong player robot,” in 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2011, pp. 3434–3439

  43. [43]

    Proximal Policy Optimization Algorithms

    J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Proximal policy optimization algorithms,” arXiv preprint arXiv:1707.06347, 2017

  44. [44]

    Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU

    M. Babaeizadeh, I. Frosio, S. Tyree, J. Clemons, and J. Kautz, “Reinforcement learning through asynchronous advantage actor-critic on a GPU,” arXiv preprint arXiv:1611.06256, 2016

  45. [45]

    Trust region policy optimization,

    J. Schulman, S. Levine, P. Abbeel, M. Jordan, and P. Moritz, “Trust region policy optimization,” in International conference on machine learning. PMLR, 2015, pp. 1889–1897

  46. [46]

    Addressing function approximation error in actor-critic methods,

    S. Fujimoto, H. Hoof, and D. Meger, “Addressing function approximation error in actor-critic methods,” in International conference on machine learning. PMLR, 2018, pp. 1587–1596

  47. [47]

    Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor,

    T. Haarnoja, A. Zhou, P. Abbeel, and S. Levine, “Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor,” in International conference on machine learning. PMLR, 2018, pp. 1861–1870