CacheMPC: Certified Cached Model Predictive Control for Quadruped Locomotion
Pith reviewed 2026-06-29 03:57 UTC · model grok-4.3
The pith
A certified cache of contact-force trajectories lets MPC for quadrupeds run 25 times faster on embedded hardware with no measurable loss in closed-loop stability.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By storing horizon contact-force trajectories in a locality-sensitive-hashed cache partitioned by contact mode and accepting them only after an a-posteriori certificate verifies primal feasibility together with an upper bound on cost suboptimality, the controller can replace most per-cycle quadratic-program solves with fast retrievals while preserving the safety properties that the original MPC would have enforced.
What carries the argument
Certified CacheMPC: a locality-sensitive-hashed store of horizon contact-force trajectories, partitioned by contact mode and retrieved only when an a-posteriori certificate confirms primal feasibility and a Lagrangian dual-gap bound on suboptimality.
If this is right
- The un-gated cache produces a 25 times median solve-time reduction in simulation and 18.7 times on hardware.
- At n=50 no statistically significant difference appears in closed-loop stable rate between cached and uncached controllers at any tested cell.
- A bounded-budget schedule can safely interleave top-K certified retrievals, a deadline-bounded QP solve, and a shifted last-certified fallback.
- The certificate is required to guarantee that accepted cached trajectories remain feasible and near-optimal.
- Partitioning the cache by contact mode allows separate retrieval logic for each gait phase.
Where Pith is reading between the lines
- The same caching-plus-certificate pattern could be tested on other periodic optimization controllers such as bipedal walking or aerial trajectory tracking.
- If the dual-gap bound proves tight enough in practice, the method might allow trading off cache size against worst-case suboptimality without changing the stability statistics.
- Longer-duration hardware trials on varied terrain would be needed to check whether the bounded-state-space premise continues to hold outside the 2,038-trial distribution.
- The certificate could be applied to other quadratic programs that recur in real-time robotics even if they are not full MPC horizons.
Load-bearing premise
Legged gaits revisit a bounded region of state space often enough that cached MPC solutions remain useful, and the per-query certificate of primal feasibility plus dual-gap bound is sufficient to keep the closed-loop system safe.
What would settle it
A larger campaign at n greater than 50 that finds a statistically significant drop in closed-loop stable rate for any cache variant relative to the no-cache baseline, or a single run in which a certified cached trajectory produces instability.
Figures
read the original abstract
Model Predictive Control (MPC) is the standard predictive layer in hierarchical quadruped controllers, but the per-cycle QP solve limits the update rate achievable on embedded processors. Because legged gaits revisit a bounded region of state space, MPC solutions admit caching and reuse. This paper proposes \emph{Certified CacheMPC}: a Locality-Sensitive-Hashed cache of horizon contact-force trajectories, partitioned by contact mode, retrieved at query time and accepted only when an a-posteriori per-query certificate confirms primal feasibility and a Lagrangian dual-gap upper bound on cost suboptimality. A bounded-budget controller schedule combines top-$K$ certified retrieval, a deadline-bounded QP solve, and a shifted last-certified fallback. The framework is evaluated on a Unitree Go2 across $2{,}038$ usable cold-controller MuJoCo trials, including a $600$-trial $n\!=\!50$ campaign at three failure-boundary cells, and a first-deploy session on the on-robot NVIDIA Orin NX. The un-gated cache delivers a $25\times$ median solve-time speedup in simulation and an $18.7\times$ median speedup on hardware. At $n\!=\!50$ no statistically significant difference in closed-loop stable rate is detected between the cache variants and the no-cache baseline at any tested cell. The certificate's contribution to closed-loop safety is not resolvable at the present sample size.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes CacheMPC, a certified caching approach for Model Predictive Control in quadruped locomotion. It uses locality-sensitive hashing to cache horizon contact-force trajectories partitioned by contact mode, accepting cached solutions only if an a-posteriori certificate verifies primal feasibility and bounds cost suboptimality via Lagrangian dual gap. A controller schedule combines cached retrievals, QP solves, and fallbacks. Evaluation on Unitree Go2 in MuJoCo with 2038 trials and hardware on NVIDIA Orin NX shows 25× median speedup in simulation and 18.7× on hardware, with no statistically significant difference in closed-loop stable rates at n=50 per cell, though the certificate's safety impact is noted as unresolvable at current sample sizes.
Significance. If the empirical results hold, the work demonstrates a practical way to accelerate MPC for legged robots on embedded hardware while attempting to preserve safety through certification. The large number of simulation trials (2038) and real hardware deployment provide substantial empirical evidence for the speedup claims. The approach leverages the repetitive nature of gaits in a bounded state space, which is a reasonable assumption for this domain, and the explicit acknowledgment of sample-size limitations for safety assessment is a strength.
major comments (2)
- [Abstract] Abstract: The headline claim of no statistically significant difference in closed-loop stable rates between cache variants and the no-cache baseline rests on n=50 trials per cell at three failure-boundary cells. This sample size yields low power to detect modest, safety-relevant increases in failure probability; the authors themselves state that the certificate's contribution to closed-loop safety is not resolvable at the present sample size, indicating the same limitation applies to the equivalence conclusion.
- [Abstract] Abstract and method description: No derivation details are provided for the a-posteriori certificate (primal feasibility plus Lagrangian dual-gap upper bound on cost suboptimality). This certificate is load-bearing for the central safety claim when accepting cached trajectories, yet its construction and sufficiency for preserving closed-loop stability are not derived or justified.
minor comments (2)
- Clarify the definition of the 2,038 'usable' cold-controller MuJoCo trials and any exclusion criteria applied before reporting speedups and stability metrics.
- The LSH parameters, cache partitioning by contact mode, and top-K retrieval budget are central to the speedup results; adding explicit values or pseudocode would improve reproducibility.
Simulated Author's Rebuttal
Thank you for the constructive review. We address the two major comments point-by-point below and outline targeted revisions that strengthen the manuscript without altering its core claims.
read point-by-point responses
-
Referee: [Abstract] Abstract: The headline claim of no statistically significant difference in closed-loop stable rates between cache variants and the no-cache baseline rests on n=50 trials per cell at three failure-boundary cells. This sample size yields low power to detect modest, safety-relevant increases in failure probability; the authors themselves state that the certificate's contribution to closed-loop safety is not resolvable at the present sample size, indicating the same limitation applies to the equivalence conclusion.
Authors: We agree that n=50 trials per cell provides limited power to detect modest differences in failure rates and that the existing sample-size caveat for the certificate's safety contribution logically extends to the stability-rate comparison. The manuscript reports the absence of a detected statistically significant difference rather than formal equivalence. We will revise the abstract to employ more precise phrasing (e.g., "no statistically significant difference was detected") and add an explicit note on statistical power in the evaluation section. revision: yes
-
Referee: [Abstract] Abstract and method description: No derivation details are provided for the a-posteriori certificate (primal feasibility plus Lagrangian dual-gap upper bound on cost suboptimality). This certificate is load-bearing for the central safety claim when accepting cached trajectories, yet its construction and sufficiency for preserving closed-loop stability are not derived or justified.
Authors: The methods section constructs the certificate by (i) confirming primal feasibility via direct substitution of the cached contact-force sequence into the QP equality and inequality constraints and (ii) bounding suboptimality by the duality gap obtained from the Lagrangian dual evaluated at the cached solution's associated multipliers. We acknowledge that the current exposition may be too terse for readers new to dual-gap certificates. We will expand the derivation with explicit algebraic steps and add a short paragraph explaining why the bounded suboptimality, together with the bounded-budget controller schedule, is compatible with the observed closed-loop stability. revision: yes
Circularity Check
No significant circularity; claims are direct empirical measurements
full rationale
The paper proposes CacheMPC as an engineering method (LSH cache of contact-force trajectories with a-posteriori primal-feasibility + dual-gap certificate) and evaluates it via direct measurement: 2038 MuJoCo trials plus hardware deployment on Unitree Go2, reporting median solve-time ratios (25× sim, 18.7× hardware) and closed-loop stable rates at n=50 cells. No derivation chain exists that reduces a claimed prediction to a fitted parameter by construction, nor any self-citation load-bearing uniqueness theorem, ansatz smuggling, or renaming of known results. The certificate is defined and applied explicitly; acceptance decisions are measured outcomes, not tautological. The reader's assessment of score 2.0 is consistent with the absence of any quoted reduction of the form "Eq. X = Eq. Y by construction."
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Legged gaits revisit a bounded region of state space so that MPC solutions admit caching and reuse.
Reference graph
Works this paper leans on
-
[1]
Dynamic locomotion in the MIT Cheetah 3 through convex model-predictive control,
J. Di Carlo, P. M. Wensing, B. Katz, G. Bledt, and S. Kim, “Dynamic locomotion in the MIT Cheetah 3 through convex model-predictive control,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS), 2018, pp. 1–9
2018
-
[2]
Mini Cheetah: A platform for pushing the limits of dynamic quadruped control,
B. Katz, J. Di Carlo, and S. Kim, “Mini Cheetah: A platform for pushing the limits of dynamic quadruped control,” inProc. IEEE Int. Conf. Robot. Autom. (ICRA), 2019, pp. 6295–6301
2019
-
[3]
A unified MPC framework for whole-body dynamic locomotion and manipula- tion,
J.-P. Sleiman, F. Farshidian, M. V . Minniti, and M. Hutter, “A unified MPC framework for whole-body dynamic locomotion and manipula- tion,”IEEE Robot. Autom. Lett., vol. 6, no. 3, pp. 4688–4695, 2021
2021
-
[4]
TinyMPC: Model-predictive control on resource- constrained microcontrollers,
K. N. Nguyenet al., “TinyMPC: Model-predictive control on resource- constrained microcontrollers,” inProc. IEEE Int. Conf. Robot. Autom. (ICRA), 2024
2024
-
[5]
MPC-Net: A first principles guided policy search,
B. Carius, R. Ranftl, V . Koltun, and M. Hutter, “MPC-Net: A first principles guided policy search,”IEEE Robot. Autom. Lett., vol. 5, no. 2, pp. 2897–2904, 2020
2020
-
[6]
The explicit linear quadratic regulator for constrained systems,
A. Bemporad, M. Morari, V . Dua, and E. N. Pistikopoulos, “The explicit linear quadratic regulator for constrained systems,”Automatica, vol. 38, no. 1, pp. 3–20, 2002
2002
-
[7]
Approximate nearest neighbors: Towards removing the curse of dimensionality,
P. Indyk and R. Motwani, “Approximate nearest neighbors: Towards removing the curse of dimensionality,” inProc. 30th Annu. ACM Symp. Theory Comput. (STOC), 1998, pp. 604–613
1998
-
[8]
qpOASES: A parametric active-set algorithm for quadratic program- ming,
H. J. Ferreau, C. Kirches, A. Potschka, H. G. Bock, and M. Diehl, “qpOASES: A parametric active-set algorithm for quadratic program- ming,”Math. Program. Comput., vol. 6, no. 4, pp. 327–363, 2014
2014
-
[9]
HPIPM: A high-performance quadratic pro- gramming framework for model predictive control,
G. Frison and M. Diehl, “HPIPM: A high-performance quadratic pro- gramming framework for model predictive control,” inProc. IFAC World Congr., 2020
2020
-
[10]
Perception-less terrain adaptation through whole body control and hierarchical optimization,
C. D. Bellicoso, C. Gehring, J. Hwangbo, P. Fankhauser, and M. Hutter, “Perception-less terrain adaptation through whole body control and hierarchical optimization,” inProc. IEEE-RAS Int. Conf. Humanoid Robots, 2016, pp. 558–564
2016
-
[11]
Online walking motion generation with automatic footstep placement,
A. Herdt, H. Diedam, P.-B. Wieber, D. Dimitrov, K. Mombaur, and M. Diehl, “Online walking motion generation with automatic footstep placement,”Adv. Robot., vol. 24, no. 5–6, pp. 719–737, 2010
2010
-
[12]
A survey of iterative learning control,
D. A. Bristow, M. Tharayil, and A. G. Alleyne, “A survey of iterative learning control,”IEEE Control Syst. Mag., vol. 26, no. 3, pp. 96–114, 2006
2006
-
[13]
MuJoCo: A physics engine for model-based control,
E. Todorov, T. Erez, and Y . Tassa, “MuJoCo: A physics engine for model-based control,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS), 2012, pp. 5026–5033
2012
-
[14]
Suboptimal model predictive control (feasibility implies stability),
P. O. M. Scokaert, D. Q. Mayne, and J. B. Rawlings, “Suboptimal model predictive control (feasibility implies stability),”IEEE Trans. Autom. Control, vol. 44, no. 3, pp. 648–654, 1999
1999
-
[15]
Conditions under which suboptimal nonlinear MPC is inherently robust,
G. Pannocchia, J. B. Rawlings, and S. J. Wright, “Conditions under which suboptimal nonlinear MPC is inherently robust,”Syst. Control Lett., vol. 60, no. 9, pp. 747–755, 2011
2011
-
[16]
MIT Cheetah 3: Design and control of a robust, dynamic quadruped robot,
G. Bledt, M. J. Powell, B. Katz, J. Di Carlo, P. M. Wensing, and S. Kim, “MIT Cheetah 3: Design and control of a robust, dynamic quadruped robot,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS), 2018, pp. 2245–2252
2018
-
[17]
ANYmal — a highly mobile and dynamic quadrupedal robot,
M. Hutter, C. Gehring, D. Jud, A. Lauber, C. D. Bellicoso, V . Tsounis, J. Hwangbo, K. Bodie, P. Fankhauser, M. Bloesch, R. Diethelm, S. Bachmann, A. Melzer, and M. Hoepflinger, “ANYmal — a highly mobile and dynamic quadrupedal robot,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS), 2016, pp. 38–44
2016
-
[18]
Learning agile and dynamic motor skills for legged robots,
J. Hwangbo, J. Lee, A. Dosovitskiy, D. Bellicoso, V . Tsounis, V . Koltun, and M. Hutter, “Learning agile and dynamic motor skills for legged robots,”Sci. Robot., vol. 4, no. 26, p. eaau5872, 2019
2019
-
[19]
Walk these ways: Tuning robot control for generalization with multiplicity of behavior,
G. B. Margolis and P. Agrawal, “Walk these ways: Tuning robot control for generalization with multiplicity of behavior,” inProc. Conf. Robot Learn. (CoRL), 2022, pp. 22–31
2022
-
[20]
Real-time optimization and nonlinear model predictive control of processes governed by differential-algebraic equations,
M. Diehl, H. G. Bock, J. P. Schl ¨oder, R. Findeisen, Z. Nagy, and F. Allg ¨ower, “Real-time optimization and nonlinear model predictive control of processes governed by differential-algebraic equations,”J. Process Control, vol. 12, no. 4, pp. 577–585, 2002
2002
-
[21]
acados— a modular open-source framework for fast embedded optimal control,
R. Verschueren, G. Frison, D. Kouzoupis, J. Frey, N. van Duijkeren, A. Zanelli, B. Novoselnik, T. Albin, R. Quirynen, and M. Diehl, “acados— a modular open-source framework for fast embedded optimal control,” Math. Program. Comput., vol. 14, no. 1, pp. 147–183, 2022
2022
-
[22]
An algorithm for multi-parametric quadratic programming and explicit MPC solutions,
P. Tøndel, T. A. Johansen, and A. Bemporad, “An algorithm for multi-parametric quadratic programming and explicit MPC solutions,” Automatica, vol. 39, no. 3, pp. 489–497, 2003
2003
-
[23]
Learning an approximate model predictive controller with guarantees,
M. Hertneck, J. K ¨ohler, S. Trimpe, and F. Allg ¨ower, “Learning an approximate model predictive controller with guarantees,”IEEE Control Syst. Lett., vol. 2, no. 3, pp. 543–548, 2018
2018
-
[24]
Approximate closed-loop robust model predictive control with guaranteed stability and constraint satisfaction,
J. A. Paulson and A. Mesbah, “Approximate closed-loop robust model predictive control with guaranteed stability and constraint satisfaction,” IEEE Control Syst. Lett., vol. 4, no. 3, pp. 719–724, 2020
2020
-
[25]
Locality-sensitive hashing scheme based onp-stable distributions,
M. Datar, N. Immorlica, P. Indyk, and V . S. Mirrokni, “Locality-sensitive hashing scheme based onp-stable distributions,” inProc. 20th Annu. Symp. Comput. Geom. (SoCG), 2004, pp. 253–262
2004
-
[26]
Near-optimal hashing algorithms for approxi- mate nearest neighbor in high dimensions,
A. Andoni and P. Indyk, “Near-optimal hashing algorithms for approxi- mate nearest neighbor in high dimensions,”Commun. ACM, vol. 51, no. 1, pp. 117–122, 2008
2008
-
[27]
Multidimensional binary search trees used for associative searching,
J. L. Bentley, “Multidimensional binary search trees used for associative searching,”Commun. ACM, vol. 18, no. 9, pp. 509–517, 1975
1975
-
[28]
Synthesis of whole-body behaviors through hierarchical control of behavioral primitives,
L. Sentis and O. Khatib, “Synthesis of whole-body behaviors through hierarchical control of behavioral primitives,”Int. J. Humanoid Robot., vol. 2, no. 4, pp. 505–518, 2005
2005
-
[29]
Highly dynamic quadruped locomotion via whole-body impulse control and model predictive control,
D. Kim, J. Di Carlo, B. Katz, G. Bledt, and S. Kim, “Highly dynamic quadruped locomotion via whole-body impulse control and model predictive control,” arXiv preprint arXiv:1909.06586, 2019
arXiv 1909
-
[30]
Feedback MPC for torque-controlled legged robots,
R. Grandia, F. Farshidian, R. Ranftl, and M. Hutter, “Feedback MPC for torque-controlled legged robots,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS), 2019, pp. 4730–4737
2019
-
[31]
Model predictive path integral control: From theory to parallel computation,
G. Williams, A. Aldrich, and E. A. Theodorou, “Model predictive path integral control: From theory to parallel computation,”J. Guid. Control Dyn., vol. 40, no. 2, pp. 344–357, 2017
2017
-
[32]
Input to state stability of min-max MPC controllers for nonlinear systems with bounded uncertainties,
D. Lim ´on, T. Alamo, F. Salas, and E. F. Camacho, “Input to state stability of min-max MPC controllers for nonlinear systems with bounded uncertainties,”Automatica, vol. 42, no. 5, pp. 797–803, 2006
2006
-
[33]
Con- strained model predictive control: Stability and optimality,
D. Q. Mayne, J. B. Rawlings, C. V . Rao, and P. O. M. Scokaert, “Con- strained model predictive control: Stability and optimality,”Automatica, vol. 36, no. 6, pp. 789–814, 2000
2000
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.