arxiv: 2602.09368 · v2 · submitted 2026-02-10 · 💻 cs.RO

Certified Gradient-Based Contact-Rich Manipulation via Smoothing-Error Reachable Tubes

Wei-Chen Li, Glen Chou

Pith reviewed 2026-05-16 06:02 UTC · model grok-4.3

classification 💻 cs.RO

keywords contact-rich manipulation · differentiable simulation · reachable sets · robust control · hybrid dynamics · affine feedback · smoothing · gradient-based optimization

The pith

Smoothing contact dynamics allows gradient-based optimization of affine feedback policies that guarantee robust constraint satisfaction on the original nonsmooth hybrid system through analytical reachable tubes for the smoothing error.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Contact-rich manipulation tasks suffer from discontinuous gradients in hybrid dynamics, making gradient-based controller optimization difficult. Smoothing the dynamics restores informative gradients but creates a mismatch that risks controller failure on real systems. The paper addresses this by characterizing the deviation as a set-valued discrepancy and incorporating it into policy optimization using analytical reachable sets. This produces time-varying affine feedback policies that ensure constraint satisfaction for the closed-loop system under the true dynamics. The approach is demonstrated on planar pushing, object rotation, and in-hand manipulation, with better safety and accuracy than baselines.
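
To make the gradient problem concrete, here is a minimal sketch of a 1D penalty-style contact force and a smoothed counterpart. The softplus smoothing, the sharpness parameter `kappa`, and all function names are illustrative assumptions; the paper itself smooths contact inside a convex-optimization-based simulator, which this toy does not reproduce.

```python
import math


def contact_force(gap: float, stiffness: float = 1.0) -> float:
    """Nonsmooth unilateral force: nonzero only under penetration (gap < 0)."""
    return stiffness * max(0.0, -gap)


def contact_force_grad(gap: float, stiffness: float = 1.0) -> float:
    """d(force)/d(gap) of the nonsmooth model: zero whenever the bodies are apart."""
    return -stiffness if gap < 0.0 else 0.0


def softplus(x: float, kappa: float) -> float:
    """Numerically stable (1/kappa) * log(1 + exp(kappa * x))."""
    z = kappa * x
    return (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / kappa


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def smoothed_force(gap: float, kappa: float = 10.0, stiffness: float = 1.0) -> float:
    """Hypothetical smoothed force: softplus replaces max(0, .)."""
    return stiffness * softplus(-gap, kappa)


def smoothed_force_grad(gap: float, kappa: float = 10.0, stiffness: float = 1.0) -> float:
    """d/d(gap) of stiffness * softplus(-gap) = -stiffness * sigmoid(-kappa * gap)."""
    return -stiffness * sigmoid(-kappa * gap)
```

With the bodies separated (for example `gap = 0.1`), `contact_force_grad` returns exactly zero, so an optimizer gets no signal about how to make contact, while `smoothed_force_grad` is nonzero. The price is that `smoothed_force` is also nonzero there, which is the model mismatch the paper sets out to bound.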

Core claim

The method plans with smoothed contact dynamics and geometry in a convex optimization-based differentiable simulator, represents the induced deviation from nonsmooth dynamics as a set-valued discrepancy, and incorporates this discrepancy into the optimization of time-varying affine feedback policies through analytical reachable sets, enabling robust constraint satisfaction for the closed-loop hybrid system while relying solely on the informative gradients of the smoothed model.

What carries the argument

Analytical reachable tubes of the smoothing-error discrepancy under time-varying affine feedback policies, which bound the possible trajectories of the original system.
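
The role of feedback in shrinking such a tube can be sketched for a scalar system. The sketch assumes the deviation from the nominal trajectory obeys `e[t+1] = (a + b*K[t]) * e[t] + w[t]` with `|w[t]| <= w_bar`, where `w` stands in for the smoothing error. This is a textbook simplification with hypothetical names, not the paper's construction.

```python
from typing import List, Sequence


def tube_radii(a: float, b: float, gains: Sequence[float],
               w_bar: float, r0: float = 0.0) -> List[float]:
    """Worst-case |e[t]| under e[t+1] = (a + b*K[t]) * e[t] + w[t], |w[t]| <= w_bar."""
    radii = [r0]
    for k in gains:
        radii.append(abs(a + b * k) * radii[-1] + w_bar)
    return radii


def rollout_error(a: float, b: float, gains: Sequence[float],
                  disturbances: Sequence[float], e0: float = 0.0) -> List[float]:
    """Simulate one error trajectory for a given disturbance sequence."""
    errors = [e0]
    for k, w in zip(gains, disturbances):
        errors.append((a + b * k) * errors[-1] + w)
    return errors


def tube_respects_bound(nominal: Sequence[float], radii: Sequence[float],
                        x_max: float) -> bool:
    """Check the upper constraint x <= x_max for every state inside the tube."""
    return all(x + r <= x_max for x, r in zip(nominal, radii))
```

For `a = b = 1`, `w_bar = 0.1`, and four steps, zero gains give a final radius of about 0.4, while gains of `-0.5` give about 0.1875. With a nominal trajectory ending at 0.8 and a constraint `x <= 1.0`, only the feedback tube passes `tube_respects_bound`, which is why the gains are optimized together with the nominal plan.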

If this is right

Produces controllers that respect the unilateral nature of contact constraints.
Certifies robust performance for closed-loop hybrid systems in contact-rich tasks.
Reduces safety violations and goal errors compared to baseline methods in pushing, rotation, and dexterous manipulation.
Relies only on gradients from the smoothed model for optimization.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method could be applied to other domains with hybrid dynamics where smoothing is used for differentiability.
Future work might extend the analytical reachable set computation to more complex contact geometries or stochastic disturbances.
Separating the smoothing used for gradients from the error bounding used for guarantees decouples efficiency from conservatism in robotic planning.

Load-bearing premise

The mismatch between smoothed and nonsmooth dynamics can be bounded by a computable set-valued discrepancy for which reachable tubes under affine policies can be derived analytically.
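
A one-dimensional analogue shows what a computable discrepancy set can look like. Assuming a softplus smoothing of `max(0, x)` with sharpness `kappa` (an illustrative choice, not the paper's smoothing), the difference between the nonsmooth and smoothed functions always lies in the interval `[-log(2)/kappa, 0]`, so the nonsmooth value is recoverable as "smoothed value plus an element of a known set".

```python
import math
from typing import Iterable, Tuple


def softplus(x: float, kappa: float) -> float:
    """Numerically stable (1/kappa) * log(1 + exp(kappa * x))."""
    z = kappa * x
    return (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / kappa


def relu(x: float) -> float:
    """Nonsmooth reference function max(0, x)."""
    return max(0.0, x)


def discrepancy_interval(kappa: float) -> Tuple[float, float]:
    """Interval D with relu(x) - softplus(x, kappa) in D for every real x."""
    return (-math.log(2.0) / kappa, 0.0)


def worst_observed_gap(kappa: float, xs: Iterable[float]) -> float:
    """Most negative sampled value of relu(x) - softplus(x, kappa)."""
    return min(relu(x) - softplus(x, kappa) for x in xs)
```

The interval shrinks as `kappa` grows, but so does the region where the smoothed gradient is informative. The premise above asks for the same kind of bound on the full contact dynamics, where it is far less obvious that one exists in closed form.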

What would settle it

A real-world experiment in which a trajectory of the hybrid system under the optimized policy violates a constraint outside the predicted reachable tube, or where the tube is too conservative to be useful.

Figures

Figures reproduced from arXiv: 2602.09368 by Glen Chou, Wei-Chen Li.

**Figure 2.** 1D pusher. (a) System schematic. (b) Discrete-time dynamics + …

**Figure 3.** Linearization of the smoothed dynamics $z^+ = f_\kappa(z, v)$ for a 1D pusher as viewed by (a) TO-CTR and (b) our method. Axes: $t$, $x_o$. Legend: (a) nominal trajectory, true dynamics rollout; (b) nominal trajectory, true dynamics feedback control, predicted tube.

**Figure 4.** Rollouts of the 1D pusher system for pushing the object to reach …

**Figure 5.** Rollout of bimanual planar bucket manipulation. (a) Time-lapse. …

**Figure 6.** Rollout of bimanual planar box manipulation. Keyframes under …

**Figure 8.** Example rollouts of bimanual planar bucket manipulation illus…

**Figure 9.** Example rollouts of bimanual planar bucket manipulation illustrat…

Original abstract

Gradient-based methods can efficiently optimize controllers by leveraging differentiable simulation and physical priors. However, contact-rich manipulation remains challenging because hybrid contact dynamics often produce discontinuous or vanishing gradients. Although smoothing the dynamics can restore informative gradients, the resulting model mismatch can cause controller failures when deployed on real systems. We address this trade-off by planning with smoothed dynamics while explicitly quantifying and compensating for the induced error, providing formal guarantees on safety and task completion under the original nonsmooth dynamics. Our approach applies smoothing to both contact dynamics and contact geometry within a differentiable simulator based on convex optimization, allowing us to characterize the deviation from the nonsmooth dynamics as a set-valued discrepancy. We incorporate this discrepancy into the optimization of time-varying affine feedback policies through analytical reachable sets, enabling robust constraint satisfaction for the closed-loop hybrid system while relying solely on the informative gradients of the smoothed model. By bridging differentiable simulation with set-valued robust control, our method produces affine feedback policies that respect the unilateral nature of contact. We evaluate our method on several contact-rich tasks, including planar pushing, object rotation, and in-hand dexterous manipulation, achieving certified constraint satisfaction with lower safety violations and smaller goal errors than baseline approaches.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper introduces smoothing-error reachable tubes to certify policies optimized on smoothed contact dynamics for the original nonsmooth hybrid system.

The paper's central idea is to smooth contact dynamics for gradient-based optimization but certify the resulting policies on the original nonsmooth system using reachable tubes that account for the smoothing error. This is the part that stands out. What is new is the specific construction of these smoothing-error reachable tubes: the authors characterize the deviation between smoothed and nonsmooth models as a set-valued discrepancy and derive analytical reachable sets under time-varying affine feedback policies. This lets them optimize using only the smoothed gradients while ensuring robust constraint satisfaction for the hybrid system.

The work does well in bridging differentiable simulation with robust control ideas. It handles both contact dynamics and geometry smoothing in a convex simulator, and the experiments across pushing, rotation, and dexterous manipulation tasks report certified performance with reduced safety violations compared to baselines.

The soft spots are around the tightness and validity of those analytical tubes. The stress-test concern about overapproximating mode switches in hybrid systems is worth watching: if the discrepancy doesn't fully capture velocity jumps or unilateral constraints across switches, the guarantees could weaken. The abstract claims formal guarantees, but without seeing the derivations, it's unclear how conservative or accurate the tubes are. That said, there's no sign of circularity in the approach.

This paper is for researchers in robotics control who deal with contact-rich manipulation and want certified methods. It would be valuable to anyone trying to make gradient-based methods reliable for real deployment. I recommend sending it to peer review. The idea is substantive and the results look promising enough to justify detailed referee input on the proofs and experiments.

Referee Report

2 major / 2 minor

Summary. The paper claims to enable certified gradient-based optimization of time-varying affine feedback policies for contact-rich manipulation by smoothing both contact dynamics and geometry in a convex-optimization-based differentiable simulator. The smoothing error is characterized as a set-valued discrepancy, which is then incorporated into policy optimization via analytical reachable sets. This yields formal guarantees of robust constraint satisfaction for the original nonsmooth hybrid system while using only the informative gradients from the smoothed model. The approach is evaluated on planar pushing, object rotation, and in-hand dexterous manipulation tasks, reporting lower safety violations and smaller goal errors than baselines.

Significance. If the analytical reachable-tube construction is shown to correctly overapproximate trajectories of the true hybrid system (including mode switches and velocity jumps), the work provides a valuable bridge between differentiable simulation and set-valued robust control. It allows safety-certified policies to be synthesized using only smoothed gradients, addressing a key limitation in contact-rich robotics. The explicit preservation of unilateral contact constraints within the affine feedback is a concrete strength that could generalize to other hybrid systems.

major comments (2)

[§4] (Reachable-set construction): the claim that the set-valued smoothing discrepancy admits closed-form reachable tubes under time-varying affine policies must explicitly address instantaneous velocity jumps at contact events; if the discrepancy set is non-convex or the affine map does not preserve invariance across switches, the tubes may fail to contain all nonsmooth trajectories, voiding the formal guarantees.
[§5.2] (Experimental validation): the reported certified constraint satisfaction relies on the tubes being tight enough; without tabulated overapproximation ratios or explicit comparison of tube volume versus observed violation rates on the real system, it is impossible to assess whether the guarantees are meaningful or merely conservative.
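
The containment concern in the first major comment can be illustrated in one dimension. The sketch below assumes a toy contact event with two possible post-event velocities (the object keeps moving, or it stops), encloses that two-point non-convex set in its interval hull, and pushes the hull through an affine closed-loop map plus a bounded disturbance. The event model, the interval representation, and all names are assumptions for illustration; the paper's sets and maps are higher-dimensional.

```python
from typing import Iterable, Tuple

Interval = Tuple[float, float]


def hull(points: Iterable[float]) -> Interval:
    """Convex hull of a finite set of reals, i.e. the smallest enclosing interval."""
    pts = list(points)
    return (min(pts), max(pts))


def affine_image(iv: Interval, gain: float, offset: float) -> Interval:
    """Exact image of an interval under x -> gain * x + offset (handles gain < 0)."""
    lo, hi = gain * iv[0] + offset, gain * iv[1] + offset
    return (min(lo, hi), max(lo, hi))


def minkowski_sum(a: Interval, b: Interval) -> Interval:
    """Interval sum {x + y : x in a, y in b}."""
    return (a[0] + b[0], a[1] + b[1])


def contains(iv: Interval, x: float, tol: float = 1e-12) -> bool:
    """Membership test with a small tolerance for floating-point rounding."""
    return iv[0] - tol <= x <= iv[1] + tol
```

Taking `hull([-1.0, 0.0])`, mapping it with gain 0.5 and offset 0.2, and adding the disturbance interval `(-0.05, 0.05)` yields a set that contains the successor of either mode outcome. It also contains successors of velocities in between that no mode actually produces, which is the source of conservatism raised in the second major comment.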

minor comments (2)

[§3] Notation for the discrepancy set D(t) is introduced without a clear definition of its dependence on the smoothing parameter; a short appendix deriving its explicit form would improve reproducibility.
[Figure 4] (reachable-tube plots): the shaded regions lack explicit axis labels for the discrepancy bounds and do not indicate the time-varying affine policy parameters used.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments. We address each major point below and indicate the revisions planned for the next version of the manuscript.

Referee: [§4] (Reachable-set construction): the claim that the set-valued smoothing discrepancy admits closed-form reachable tubes under time-varying affine policies must explicitly address instantaneous velocity jumps at contact events; if the discrepancy set is non-convex or the affine map does not preserve invariance across switches, the tubes may fail to contain all nonsmooth trajectories, voiding the formal guarantees.

Authors: We appreciate this observation and agree that the current presentation in §4 would benefit from greater explicitness on this point. In the revision we will add a dedicated paragraph and supporting lemma showing that the set-valued discrepancy is constructed to enclose the velocity jumps at contact events (by taking the convex hull of the possible post-impact velocities consistent with the smoothing error bound). Because the feedback policy is affine, the image of this convex discrepancy set under the closed-loop map remains convex and the reachable-tube recursion preserves the over-approximation property across mode switches. The formal guarantees therefore continue to hold for the original hybrid system. We will also include a short proof sketch of the invariance step. revision: yes
Referee: [§5.2] (Experimental validation): the reported certified constraint satisfaction relies on the tubes being tight enough; without tabulated overapproximation ratios or explicit comparison of tube volume versus observed violation rates on the real system, it is impossible to assess whether the guarantees are meaningful or merely conservative.

Authors: We agree that quantitative tightness metrics would strengthen the validation. In the revised §5.2 we will add a table reporting, for each task, the ratio of reachable-tube volume to the volume of the convex hull of observed trajectories, together with the empirical safety-violation rate inside the tube. Hardware experiments on the physical system are currently limited by sensor noise and actuation bandwidth; we will therefore present the comparison on high-fidelity simulation and explicitly discuss the gap to hardware as a limitation, with plans for future real-robot validation. These additions will allow readers to judge the practical tightness of the certificates. revision: partial
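
A one-dimensional stand-in for the tightness metrics discussed in the second response might look like the sketch below. It assumes the tube is given as one interval per time step and the rollouts as equal-length state sequences; the width-to-spread ratio replaces the volume ratio, and the violation rate is taken to mean the fraction of sampled states that leave the tube. Both definitions and all names are assumptions, not the authors' reported metrics.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]


def tightness_ratios(tube: Sequence[Interval],
                     rollouts: Sequence[Sequence[float]]) -> List[float]:
    """Per-step ratio of tube width to the spread of observed states (>= 1 if contained)."""
    ratios = []
    for t, (lo, hi) in enumerate(tube):
        states = [r[t] for r in rollouts]
        spread = max(states) - min(states)
        ratios.append((hi - lo) / spread if spread > 0.0 else float("inf"))
    return ratios


def violation_rate(tube: Sequence[Interval],
                   rollouts: Sequence[Sequence[float]]) -> float:
    """Fraction of sampled states outside the tube; needs at least one sample."""
    total = 0
    outside = 0
    for r in rollouts:
        for (lo, hi), x in zip(tube, r):
            total += 1
            outside += not (lo <= x <= hi)
    return outside / total
```

A ratio near 1 with a zero violation rate would indicate a tight and valid tube on the sampled rollouts; a large ratio signals conservatism. Because sampled rollouts only under-approximate the true reachable set, a zero rate is evidence for the certificate, not a proof of it.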

Circularity Check

0 steps flagged

No significant circularity; analytical reachable-tube derivation is independent of fitted outcomes

The paper's core chain derives the set-valued smoothing discrepancy and its analytical reachable tubes directly from the convex-optimization simulator properties and set-valued robust control, then uses those tubes to constrain the policy optimization that only employs smoothed gradients. No equation reduces a prediction to a fitted parameter by construction, no load-bearing uniqueness theorem is imported via self-citation, and the formal guarantees are stated to hold for the original hybrid system without reference to experimental performance numbers. The derivation is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The approach rests on the domain assumption that a convex-optimization differentiable simulator can produce a well-behaved smoothed model whose deviation from nonsmooth dynamics admits an analytically tractable set-valued representation; no free parameters or new physical entities are explicitly introduced in the abstract.

axioms (1)

domain assumption: Convex-optimization-based differentiable simulator accurately captures smoothed contact dynamics and geometry
Invoked to justify the existence of informative gradients and the computability of the discrepancy set.

invented entities (1)

Smoothing-error reachable tubes · no independent evidence
purpose: To bound the closed-loop deviation between smoothed planning model and original nonsmooth dynamics for robust constraint satisfaction
New construct introduced to bridge differentiable simulation and set-valued robust control; no independent falsifiable evidence supplied in abstract.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

We smooth both contact dynamics and geometry via a novel differentiable simulator based on convex optimization... characterize the deviation... as a set-valued discrepancy... analytical reachable sets

IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection · unclear

Theorem 1... $f_0(x,u) = f_\kappa(x,u) - P(x)^{-1} \sum_i J_i(x)^\top \frac{\partial \lambda_{\kappa,i}}{\partial \kappa}\, \kappa\, w_i$

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Over-Approximating Minimizer Sets of Constrained Convex Programs with Parametric Uncertainty via Reachability Analysis
math.OC · 2026-04 · unverdicted · novelty 6.0

A reachability-analysis method on projected gradient descent dynamics produces certified outer approximations to the minimizer sets of strongly convex programs whose costs depend on bounded uncertain parameters.
