Featurized Occupation Measures for Structured Global Search in Numerical Optimal Control
Pith reviewed 2026-05-15 10:28 UTC · model grok-4.3
The pith
Featurized occupation measures provide a primal-dual interface that lets HJB subsolutions guide scalable global search in optimal control.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Featurized Occupation Measures form a finite-dimensional primal-dual interface that couples numerical optimal control solvers with explicit HJB subsolutions. The explicit realization employs finite weak-form Liouville tests while the implicit realization pairs rollout-based search with sampled primal-dual residuals. Both are asymptotically consistent with the exact occupation-measure linear program under refinement. For factor graphs induced by compatible passivity-based interconnections, blockwise HJB inequalities assemble into globally feasible OM-dual certificates whose decomposition is preserved under blockwise approximation.
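For orientation, the exact occupation-measure linear program that both realizations are compared against is usually written in the standard finite-horizon, deterministic form below. The notation (dynamics f, stage cost \ell, terminal cost \phi, occupation measure \mu, terminal measure \mu_T, initial measure \mu_0, test function v) is an assumption chosen for illustration and is not taken from the paper.

```latex
\begin{align*}
\text{(primal)}\quad
  & \inf_{\mu,\,\mu_T \ge 0}\;
    \int \ell(t,x,u)\,d\mu(t,x,u) + \int \phi(x)\,d\mu_T(x) \\
  & \text{s.t.}\quad
    \int v(T,x)\,d\mu_T(x) - \int v(0,x)\,d\mu_0(x)
    = \int \bigl(\partial_t v + \nabla_x v \cdot f(t,x,u)\bigr)\,d\mu(t,x,u)
    \quad \forall\, v \in C^1, \\[4pt]
\text{(dual)}\quad
  & \sup_{v \in C^1}\; \int v(0,x)\,d\mu_0(x) \\
  & \text{s.t.}\quad
    \partial_t v + \nabla_x v \cdot f(t,x,u) + \ell(t,x,u) \ge 0
    \quad \forall\,(t,x,u), \qquad
    v(T,x) \le \phi(x) \quad \forall\, x.
\end{align*}
```

The primal constraint is the weak-form Liouville equation, and any dual-feasible v is an HJB subsolution that lower-bounds the optimal cost. Enforcing the Liouville constraint for only finitely many test functions relaxes the primal, which is the general idea behind finite weak-form tests; how the paper selects those functions is not specified in this summary.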
What carries the argument
Featurized Occupation Measures, a finite-dimensional primal-dual interface that couples solvers with HJB subsolutions via features and weak-form tests or sampled residuals.
If this is right
- Both realizations are asymptotically consistent with the exact occupation-measure linear program under refinement.
- Blockwise HJB inequalities assemble into globally feasible OM-dual certificates for passivity-based factor graphs.
- The blockwise decomposition is preserved under approximation, shifting the curse of dimensionality from state space to interconnection topology.
- Approximate certificates remain reusable under time shifts and bounded perturbations with explicit degradation bounds.
- Certificates of increasing tightness guide a sample-based optimizer toward global optima on a static obstacle-avoidance benchmark.
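The reusability item in the list above can be made concrete with a standard perturbation argument. The sketch below is not the paper's bound. It assumes a nominal model f, a perturbed model \tilde f with a uniform error \delta, a gradient bound L on the certificate, and a horizon T, all introduced here for illustration.

```latex
\begin{align*}
&\text{Assume}\quad
  \partial_t v + \nabla_x v \cdot f + \ell \ge 0,\qquad
  v(T,x) \le \phi(x),\qquad
  \|\tilde f - f\| \le \delta,\qquad
  \|\nabla_x v\| \le L. \\
&\text{Then}\quad
  \partial_t v + \nabla_x v \cdot \tilde f + \ell
  \;\ge\; -\|\nabla_x v\|\,\|\tilde f - f\| \;\ge\; -L\delta. \\
&\text{Define}\quad \tilde v(t,x) = v(t,x) - L\delta\,(T - t)
  \;\;\Longrightarrow\;\;
  \partial_t \tilde v + \nabla_x \tilde v \cdot \tilde f + \ell \ge 0,\qquad
  \tilde v(T,x) = v(T,x) \le \phi(x). \\
&\text{Degradation of the lower bound at } t = 0:\quad
  v(0,x) - \tilde v(0,x) = L\,\delta\,T.
\end{align*}
```

Under these assumptions the shifted function remains an exact subsolution for the perturbed model, and the price is a lower bound that is looser by at most L\delta T.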
Where Pith is reading between the lines
- The approach could scale global search to high-dimensional modular systems by exploiting interconnection structure rather than state dimension.
- Certificate reusability under perturbations suggests direct use in receding-horizon or adaptive control loops.
- Similar featurization might extend to other decompositions beyond passivity-based interconnections.
- One could test whether asymptotic consistency persists when blockwise assembly is only approximate for non-compatible graphs.
Load-bearing premise
The chosen features and weak-form tests or sampled residuals capture enough global information to guide search, and the passivity-based interconnection structure permits exact blockwise assembly of certificates.
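One way to see why a passivity-type interconnection can permit blockwise assembly is the stationary two-block sketch below. The symbols, the supply-rate form w_i^\top y_i, and the specific feedback interconnection are illustrative assumptions; the paper's actual compatibility conditions are not reproduced in this summary.

```latex
\begin{align*}
&\text{Blocks } i = 1,2:\quad
  \dot x_i = f_i(x_i,u_i,w_i),\qquad y_i = h_i(x_i),\qquad
  \text{stage cost } \ell_i(x_i,u_i),\qquad \dim y_1 = \dim y_2. \\
&\text{Blockwise inequality:}\quad
  \nabla v_i(x_i)\cdot f_i(x_i,u_i,w_i) + \ell_i(x_i,u_i) \;\ge\; w_i^\top y_i
  \quad \forall\,(x_i,u_i,w_i). \\
&\text{Interconnection:}\quad
  w_1 = -y_2,\quad w_2 = y_1
  \;\;\Longrightarrow\;\;
  w_1^\top y_1 + w_2^\top y_2 = -y_2^\top y_1 + y_1^\top y_2 = 0. \\
&\text{Summing with } v(x) = v_1(x_1) + v_2(x_2),\ \ell = \ell_1 + \ell_2:\quad
  \nabla v(x)\cdot f(x,u) + \ell(x,u) \;\ge\; 0
  \quad \forall\,(x,u).
\end{align*}
```

In this sketch each v_i depends only on its own block state, so the assembled certificate never needs a representation over the full state space; the coupling enters only through supply rates that cancel across the interconnection.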
What would settle it
A concrete falsifier is the observation that blockwise-assembled certificates on a passivity-interconnected system violate global dual feasibility conditions of the occupation-measure program, or that refined certificates fail to steer a sample-based optimizer to the known global optimum on the static obstacle-avoidance benchmark.
Original abstract
Numerical optimal control has long been split between globally structured but dimensionally intractable Hamilton–Jacobi–Bellman (HJB) methods and scalable but local trajectory optimization. We introduce Featurized Occupation Measures (FOM), a finite-dimensional primal-dual interface for coupling numerical optimal control solvers with explicit HJB subsolutions: the certificate guides the primal search, while primal residuals tighten the certificate in a primal-dual language. Two realizations are developed. The explicit realization uses finite weak-form Liouville tests, and the implicit realization couples rollout-based search with sampled primal-dual residuals. Both are proved asymptotically consistent with the exact occupation-measure linear program under refinement, separating primal expressiveness from dual accuracy in the limit. The framework also gives structural conditions under which HJB-type certificates avoid full state-space representation. For factor graphs induced by compatible passivity-based interconnections, blockwise HJB inequalities assemble into globally feasible OM-dual certificates, and the decomposition is preserved under blockwise approximation. The curse of dimensionality is then shifted from state space to interconnection topology. Approximate certificates remain reusable under time shifts and bounded model perturbations, with explicit degradation bounds. On a static obstacle-avoidance benchmark, certificates of increasing tightness guide a sample-based optimizer toward global optima, confirming that even a coarse certificate carries useful global information.
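The interplay between a certificate and rollout-based search can be illustrated on a toy problem. The sketch below is not the paper's method or benchmark: it assumes a scalar integrator x' = u with stage cost x^2 + u^2, for which the true value function is V(x) = x^2 and v_c(x) = c x^2 is an HJB subsolution whenever 0 <= c <= 1. The function names and sampling ranges are hypothetical.

```python
import numpy as np

# Toy problem (an assumption for illustration, not the paper's benchmark):
# dynamics x' = u, stage cost l(x, u) = x^2 + u^2, infinite horizon.
# True value function: V(x) = x^2. Candidate certificate: v_c(x) = c * x^2.


def hjb_residual(c, x, u):
    """l(x, u) + v_c'(x) * f(x, u); v_c is a subsolution iff this is >= 0."""
    return x**2 + u**2 + 2.0 * c * x * u


def min_sampled_residual(c, n=20_000, seed=0):
    """Smallest residual over random (x, u) samples; a negative value refutes v_c."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-2.0, 2.0, n)
    u = rng.uniform(-4.0, 4.0, n)
    return float(hjb_residual(c, x, u).min())


def rollout_cost(k, x0, dt=1e-3, horizon=20.0):
    """Cost of the feedback u = -k * x from x0 (explicit Euler, truncated horizon)."""
    x, cost = x0, 0.0
    for _ in range(int(horizon / dt)):
        u = -k * x
        cost += (x * x + u * u) * dt
        x += u * dt
    return cost


def certified_gap(c, gains, x0):
    """Best sampled rollout cost minus the certificate's lower bound c * x0^2."""
    best = min(rollout_cost(k, x0) for k in gains)
    return best - c * x0**2


if __name__ == "__main__":
    gains = np.linspace(0.2, 3.0, 15)  # candidate feedback gains to sample
    for c in (0.0, 0.5, 1.0):
        print(c, min_sampled_residual(c), certified_gap(c, gains, x0=1.0))
```

Raising c from 0 to 1 shrinks the certified gap, so the same rollouts come with a sharper optimality statement, while a candidate with c > 1 is rejected by the sampled residual test.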
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Featurized Occupation Measures (FOM) as a finite-dimensional primal-dual interface coupling numerical optimal control solvers with HJB subsolutions. Two realizations are developed: explicit (finite weak-form Liouville tests) and implicit (rollout-based search with sampled residuals). Both are proved asymptotically consistent with the exact occupation-measure LP under refinement. Structural conditions are provided under which, for factor graphs from compatible passivity-based interconnections, blockwise HJB inequalities assemble into globally feasible OM-dual certificates, shifting the curse of dimensionality to topology. Approximate certificates are reusable under time shifts and perturbations with explicit bounds. The method is illustrated on a static obstacle-avoidance benchmark where increasing certificate tightness guides a sample-based optimizer to global optima.
Significance. If the consistency proofs and structural decomposition hold, this provides a substantive advance in bridging globally structured HJB methods with scalable local trajectory optimization. The ability to assemble certificates blockwise via passivity structure, together with reusability bounds, offers a concrete mechanism for dimensionality reduction without full state-space representation. The benchmark demonstration that even coarse certificates carry useful global information strengthens the practical case.
major comments (2)
- [Abstract / structural conditions] The claim that 'for factor graphs induced by compatible passivity-based interconnections, blockwise HJB inequalities assemble into globally feasible OM-dual certificates, and the decomposition is preserved under blockwise approximation' is load-bearing for the dimensionality-reduction argument. The obstacle-avoidance benchmark is presented as confirmation, yet no explicit verification is given that this example satisfies the passivity compatibility condition or that approximation errors do not accumulate across blocks. Without this check the global feasibility guarantee remains conditional.
- [Consistency proofs section] Consistency claims (both realizations): The asymptotic consistency with the exact OM LP is stated as proved, separating primal expressiveness from dual accuracy. However, the manuscript must explicitly address whether blockwise approximation errors remain controlled when the passivity interconnection is only approximately satisfied, as this directly affects the claim that the curse shifts from state space to topology.
minor comments (2)
- [Introduction / realizations] Clarify the precise definition of 'featurized' occupation measures and the choice of weak-form test functions in the explicit realization; the current description leaves open how the features are selected to capture global information without full state representation.
- [Reusability subsection] The reusability results for approximate certificates under time shifts and model perturbations come with explicit degradation bounds; include a short numerical illustration of how tight these bounds are on the benchmark to aid readability.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed report. We address the major comments point by point below and will revise the manuscript accordingly to strengthen the presentation of the structural conditions and consistency results.
- Referee: [Abstract / structural conditions] The claim that blockwise HJB inequalities assemble into globally feasible OM-dual certificates for passivity-based interconnections is load-bearing, but the benchmark lacks explicit verification of the passivity compatibility condition and that approximation errors do not accumulate across blocks.
  Authors: We agree that explicit verification is required to support the dimensionality-reduction claim. In the revised manuscript we will add a dedicated subsection verifying that the static obstacle-avoidance benchmark satisfies the compatible passivity-based interconnection assumptions. We will also supply a direct check (via the factor-graph structure and passivity indices) confirming that blockwise approximation errors remain controlled and do not accumulate to violate global feasibility, rendering the guarantee unconditional for the reported example.
  Revision: yes
- Referee: [Consistency proofs section] The manuscript must explicitly address whether blockwise approximation errors remain controlled when the passivity interconnection is only approximately satisfied.
  Authors: The existing consistency proofs are stated under exact satisfaction of the passivity interconnection. We acknowledge the need for an explicit treatment of the approximate case. The revision will include a new paragraph in the consistency section that derives quantitative bounds on the growth of blockwise errors under small perturbations of the interconnection operators, together with the precise conditions under which the shift of the curse of dimensionality from state space to topology remains valid with controlled degradation.
  Revision: yes
Circularity Check
No significant circularity; claims rest on stated separate proofs and domain assumptions
The paper explicitly states that asymptotic consistency with the exact occupation-measure LP is proved separately for both realizations, and the blockwise HJB-to-OM certificate assembly is conditioned on passivity-based interconnection compatibility as a structural assumption rather than a fitted or self-defined quantity. No equation or claim reduces by construction to its own inputs, no parameter is fitted to a subset and renamed as prediction, and no load-bearing step collapses to a self-citation chain. The framework therefore remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: Occupation measures admit finite weak-form Liouville tests that preserve asymptotic consistency under refinement.
- Domain assumption: Passivity-based interconnections induce factor graphs where blockwise HJB inequalities assemble into globally feasible dual certificates.
invented entities (1)
- Featurized Occupation Measures (FOM): no independent evidence