Precise Aggressive Aerial Maneuvers with Sensorimotor Policies
Pith reviewed 2026-05-10 18:57 UTC · model grok-4.3
The pith
Sensorimotor policies trained via reinforcement learning enable quadrotors to traverse narrow gaps tilted up to 90 degrees with 5 cm clearance using only onboard sensors.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors claim that sensorimotor policies, trained end-to-end with reinforcement learning and policy distillation in simulation after initialization with model-based planner trajectories, allow a quadrotor to perform precise aggressive maneuvers through narrow rectangular gaps. These include passages with only 5 cm clearance at up to 90-degree tilt, without any prior information on the gap's location or orientation, and with the ability to handle dynamic gaps reactively. The method extends to sequences of gaps and varied geometries.
What carries the argument
End-to-end sensorimotor policies that map onboard vision and proprioception directly to low-level control commands, trained with RL and initialized using trajectories from a model-based planner.
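The overall shape of such a policy can be sketched as a small network that fuses a visual embedding with proprioception and emits low-level commands. The layer sizes, the 64-d visual feature, the 13-d proprioceptive state, and the 4-d thrust/body-rate head below are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def mlp(x, layers):
    """Apply a stack of (W, b) layers with tanh activations; linear output head."""
    for W, b in layers[:-1]:
        x = np.tanh(x @ W + b)
    W, b = layers[-1]
    return x @ W + b

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    return (rng.normal(0, 0.1, (n_in, n_out)), np.zeros(n_out))

# Illustrative inputs: a 64-d visual embedding (e.g. from a small CNN over
# the onboard camera) concatenated with a 13-d proprioceptive state
# (attitude, body rates, velocity, last action).
vis_feat = rng.normal(size=64)
proprio = rng.normal(size=13)
obs = np.concatenate([vis_feat, proprio])

# Hypothetical output head: collective thrust plus three body rates,
# the kind of low-level command interface the paper describes.
policy = [init_layer(77, 128), init_layer(128, 128), init_layer(128, 4)]
cmd = mlp(obs, policy)
```

The point is only the input/output contract: raw onboard observations in, actuator-level commands out, with no intermediate pose estimate of the gap.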
If this is right
- The policy achieves high repeatability in real-world gap traversals with low clearance and high tilt.
- It enables reactive servo control for moving gaps without training on dynamic scenarios.
- Policies can be developed for challenging tracks consisting of multiple narrow gaps placed closely together.
- The approach works for geometrically diverse gaps without requiring manually defined traversal poses or visual features.
Where Pith is reading between the lines
- If the sim-to-real transfer holds, this method could be applied to other aggressive aerial tasks such as rapid obstacle avoidance in unknown environments.
- The initialization strategy using model-based plans may help overcome exploration challenges in other robotic RL applications with constrained action spaces.
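The initialization idea, stripped to its core, is behavior cloning: fit the policy to state-action pairs produced by the model-based planner before handing it to the RL optimizer. A minimal sketch, with synthetic data standing in for planner rollouts and a linear policy head (both assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in dataset: (state, action) pairs a model-based planner would
# generate in simulation. Shapes are illustrative, not the paper's.
states = rng.normal(size=(500, 13))
true_K = rng.normal(size=(13, 4))
actions = states @ true_K + 0.01 * rng.normal(size=(500, 4))

# Behavior-cloning initialization: fit the policy head to the planner's
# actions by least squares, then use it as a warm start for RL instead
# of starting exploration from a random policy.
K_init, *_ = np.linalg.lstsq(states, actions, rcond=None)

bc_error = np.mean((states @ K_init - actions) ** 2)
random_error = np.mean((states @ rng.normal(size=(13, 4)) - actions) ** 2)
```

Because the warm start already tracks the planner, early RL rollouts spend their time near the narrow feasible corridor instead of colliding at random.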
Load-bearing premise
The simulation accurately enough represents real aerodynamics, sensor noise, and vehicle dynamics that the learned policy transfers directly to the physical quadrotor without major domain randomization or fine-tuning.
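This premise is usually hedged in practice with at least light domain randomization over dynamics parameters. A minimal per-episode randomization sketch (the parameter names, nominal values, and ±15% spread are illustrative assumptions, not taken from the paper):

```python
import random

# Nominal quadrotor parameters (illustrative values only).
NOMINAL = {"mass_kg": 0.76, "thrust_coeff": 1.0, "motor_delay_s": 0.03}

def randomize_dynamics(nominal, spread=0.15, rng=random):
    """Sample per-episode dynamics so the policy cannot overfit one model.

    Each parameter is scaled by a factor in [1 - spread, 1 + spread],
    a common recipe in zero-shot sim-to-real transfer.
    """
    return {k: v * rng.uniform(1 - spread, 1 + spread)
            for k, v in nominal.items()}

params = randomize_dynamics(NOMINAL)
```

The narrower the spread needed for transfer, the more weight this load-bearing premise carries.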
What would settle it
Testing the policy on a real quadrotor flying through a rectangular gap with 5 cm clearance at 90-degree tilt: success with high repeatability would support the claim, while frequent collisions or failures would point to a simulation-to-reality gap.
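"High repeatability" becomes checkable once the trial campaign reports a success count with a confidence interval. A sketch using the Wilson score interval, with a hypothetical 27-of-30 outcome (the numbers are invented for illustration, not reported results):

```python
from math import sqrt

def wilson_interval(successes, trials, z=1.96):
    """95% Wilson score interval for a binomial success rate."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2))
    return center - half, center + half

# Hypothetical campaign: 27 clean traversals in 30 attempts.
lo, hi = wilson_interval(27, 30)  # roughly (0.74, 0.97)
```

Even 27/30 leaves a lower bound near 74%, which is why trial counts matter as much as the headline rate.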
Original abstract
Precise aggressive maneuvers with lightweight onboard sensors remains a key bottleneck in fully exploiting the maneuverability of drones. Such maneuvers are critical for expanding the systems' accessible area by navigating through narrow openings in the environment. Among the most relevant problems, a representative one is aggressive traversal through narrow gaps with quadrotors under SE(3) constraints, which require the quadrotors to leverage a momentary tilted attitude and the asymmetry of the airframe to navigate through gaps. In this paper, we achieve such maneuvers by developing sensorimotor policies directly mapping onboard vision and proprioception into low-level control commands. The policies are trained using reinforcement learning (RL) with end-to-end policy distillation in simulation. We mitigate the fundamental hardness of model-free RL's exploration on the restricted solution space with an initialization strategy leveraging trajectories generated by a model-based planner. Careful sim-to-real design allows the policy to control a quadrotor through narrow gaps with low clearances and high repeatability. For instance, the proposed method enables a quadrotor to navigate a rectangular gap at a 5 cm clearance, tilted at up to 90-degree orientation, without knowledge of the gap's position or orientation. Without training on dynamic gaps, the policy can reactively servo the quadrotor to traverse through a moving gap. The proposed method is also validated by training and deploying policies on challenging tracks of narrow gaps placed closely. The flexibility of the policy learning method is demonstrated by developing policies for geometrically diverse gaps, without relying on manually defined traversal poses and visual features.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to develop sensorimotor policies for quadrotors that directly map onboard vision and proprioception to low-level control commands, enabling precise aggressive traversals through narrow rectangular gaps at 5 cm clearance and up to 90° tilt without explicit knowledge of gap position or orientation. Policies are trained via RL with end-to-end distillation from model-based planner trajectories in simulation, transferred to hardware through careful sim-to-real design, and demonstrated to reactively handle moving gaps as well as sequences of diverse gaps.
Significance. If the empirical claims hold under rigorous validation, the work would represent a meaningful advance in autonomous aerial robotics by showing that learned policies can achieve repeatable, high-precision SE(3) maneuvers on physical hardware without pose estimation or hand-crafted features. The model-based initialization strategy to address RL exploration hardness and the generalization to unseen dynamic gaps are notable strengths that could inform future sim-to-real efforts for underactuated systems.
major comments (2)
- [Abstract and §4] Abstract and §4 (Experimental Results): The abstract and results claim 'high repeatability' for 5 cm clearance and 90° tilt traversals on hardware, yet report no quantitative success rates, trial counts, failure modes, error bars, or ablation studies on sim-to-real factors. This is load-bearing for the central claim of direct policy transfer and real-world viability.
- [§3] §3 (Method, sim-to-real design): The 'careful sim-to-real design' invoked to justify zero-shot hardware deployment does not quantify or mitigate mismatches in aerodynamics (blade flapping, asymmetric downwash, gap-induced ground effect) during 90°-tilt aggressive flight, which directly risks the reported repeatability even with perfect vision/proprioception.
minor comments (2)
- [Abstract] Abstract: The phrasing 'Among the most relevant problems, a representative one is aggressive traversal...' is somewhat awkward and could be tightened for readability.
- [Notation and §2] Throughout: Some notation for SE(3) constraints and policy inputs could be clarified with an explicit diagram or table of sensor modalities.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and for recognizing the potential significance of our sensorimotor policy approach for aggressive aerial maneuvers. We address each major comment below with clarifications and planned revisions to strengthen the manuscript.
Point-by-point responses
-
Referee: [Abstract and §4] Abstract and §4 (Experimental Results): The abstract and results claim 'high repeatability' for 5 cm clearance and 90° tilt traversals on hardware, yet report no quantitative success rates, trial counts, failure modes, error bars, or ablation studies on sim-to-real factors. This is load-bearing for the central claim of direct policy transfer and real-world viability.
Authors: We agree that explicit quantitative metrics would better support the repeatability claims. The full manuscript reports multiple hardware trials across different gap configurations and orientations, with consistent successful traversals shown in the accompanying videos and described in §4. To address this directly, we will revise the abstract and §4 to include specific success rates (e.g., successful traversals out of total attempts), trial counts, observed failure modes, and basic statistics such as position error distributions. We will also incorporate a concise ablation on key sim-to-real factors like domain randomization ranges. These additions will provide the rigorous validation requested without altering the core empirical findings. revision: yes
-
Referee: [§3] §3 (Method, sim-to-real design): The 'careful sim-to-real design' invoked to justify zero-shot hardware deployment does not quantify or mitigate mismatches in aerodynamics (blade flapping, asymmetric downwash, gap-induced ground effect) during 90°-tilt aggressive flight, which directly risks the reported repeatability even with perfect vision/proprioception.
Authors: This is a valid observation regarding the level of detail in our sim-to-real transfer discussion. Section 3 describes our use of domain randomization over dynamics parameters (including thrust curves and delays) and sensor noise to enable zero-shot deployment, which proved sufficient for the reported hardware results. However, we did not provide explicit quantification or targeted mitigation for effects such as blade flapping or gap-induced ground effects at extreme tilts. In the revision, we will expand §3 to include a discussion of these aerodynamic considerations, referencing our simulation parameters and post-hoc hardware observations that informed the randomization strategy. This will better justify why the policy maintained repeatability despite potential mismatches. revision: yes
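The "thrust curves and delays" the response refers to can be illustrated with a minimal first-order motor-lag model, the kind of actuator response typically randomized per episode. The time constants and step counts below are illustrative assumptions, not the paper's identified parameters:

```python
import numpy as np

def simulate_thrust(cmd, tau, dt=0.002, steps=200):
    """First-order motor lag: dT/dt = (cmd - T) / tau.

    A simple stand-in for randomized thrust dynamics; resampling tau per
    episode forces the policy to tolerate a range of actuator responses.
    """
    T = 0.0
    out = []
    for _ in range(steps):
        T += dt / tau * (cmd - T)
        out.append(T)
    return np.array(out)

slow = simulate_thrust(1.0, tau=0.06)
fast = simulate_thrust(1.0, tau=0.02)
```

A policy that only ever saw the fast motor would over-rotate on the slow one; randomizing tau is the cheap insurance the rebuttal gestures at, though it does not by itself cover tilt-dependent effects like blade flapping.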
Circularity Check
No circularity: empirical RL training and sim-to-real deployment
full rationale
The paper's claims rest on training sensorimotor policies via RL in simulation (with model-based initialization and end-to-end distillation) followed by hardware deployment under 'careful sim-to-real design.' These are standard empirical steps whose success is measured by repeatable real-world traversal results, not by any equation or parameter that reduces to its own inputs by construction. No self-citations, uniqueness theorems, or fitted quantities are presented as independent predictions. The derivation chain is self-contained through conventional RL practices and experimental validation.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: Simulation dynamics and sensor models are sufficiently accurate for zero-shot policy transfer to real hardware.
- Domain assumption: Model-based planner trajectories lie close enough to feasible RL solutions to bootstrap exploration.