pith. machine review for the scientific record.

arxiv: 2605.07529 · v1 · submitted 2026-05-08 · 📡 eess.SY · cs.SY · math.OC

Recognition: 2 theorem links


Stochastic Differential Dynamic Programming for Trajectory Optimization under Partial Observability

Masahiro Fujiwara, Naoya Ozaki


Pith reviewed 2026-05-11 01:55 UTC · model grok-4.3

classification 📡 eess.SY · cs.SY · math.OC

keywords spacecraft trajectory optimization · stochastic differential dynamic programming · partial observability · belief space planning · covariance control · orbit determination · navigation-aware trajectories

The pith

A stochastic differential dynamic programming algorithm optimizes spacecraft trajectories under partial observability by jointly handling control and belief-state evolution.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a stochastic differential dynamic programming method for designing spacecraft trajectories when states are only partially observable because of execution errors and measurement noise. It optimizes both the reference control sequence and the associated feedback gains while propagating a belief state that includes covariance, and it does so without invoking the separation principle between control and estimation. A sympathetic reader cares because the approach captures how the choice of path itself influences the accuracy of orbit determination, allowing the solver to trade fuel against navigation performance in a single optimization. In the circular restricted three-body problem the resulting navigation-aware plans consume substantially less fuel than deterministic local optimization started from the same guess.

Core claim

The proposed stochastic differential dynamic programming algorithm optimizes the nominal control sequence and feedback gains subject to belief dynamics and general mission constraints, explicitly accounting for the dependence of covariance propagation on the nominal trajectory without relying on the separation principle, and thereby produces navigation-aware and uncertainty-robust solutions.

What carries the argument

Stochastic differential dynamic programming operating on belief dynamics that embed the dependence of covariance evolution on the chosen nominal trajectory.
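The trajectory dependence of the belief dynamics can be illustrated with a toy EKF-style covariance recursion. This is a hedged sketch: the planar double-integrator model, the beacon geometry, and the range-dependent noise are illustrative assumptions, not the paper's models.

```python
import numpy as np

# Toy illustration (not the paper's model): a planar double integrator
# observed by a single range measurement to a fixed beacon, with noise
# that grows with range. Because the measurement Jacobian and noise are
# evaluated on the nominal trajectory, the propagated covariance depends
# on which nominal path is flown -- the coupling the optimizer exploits.
dt = 0.1
A = np.eye(4)
A[0, 2] = A[1, 3] = dt                # state: [x, y, vx, vy]
Q = 1e-4 * np.eye(4)                  # process noise (execution errors)
beacon = np.array([1.0, 0.0])

def propagate_covariance(nominal_positions, P0):
    """EKF-style covariance recursion along a fixed nominal trajectory."""
    P = P0.copy()
    for pos in nominal_positions:
        P = A @ P @ A.T + Q                      # time update
        d = pos - beacon
        r = np.linalg.norm(d)
        H = np.zeros((1, 4))
        H[0, :2] = d / r                         # range Jacobian at nominal
        R = np.array([[(0.1 * r) ** 2]])         # range-dependent noise
        K = P @ H.T / (H @ P @ H.T + R)
        P = (np.eye(4) - K @ H) @ P              # measurement update
    return P

P0 = 1e-2 * np.eye(4)
far = propagate_covariance([np.array([0.0, 2.0])] * 50, P0)
near = propagate_covariance([np.array([0.5, 0.1])] * 50, P0)
print(np.trace(near), np.trace(far))  # the near-beacon path is better observed
```

Two static nominal paths already yield different terminal covariances; a solver that sees this coupling can steer the nominal toward better-observed regions.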

Load-bearing premise

The belief dynamics must accurately represent how covariance propagates along a given nominal trajectory, and the iterative updates must converge to a point that satisfies all mission constraints.

What would settle it

Apply the algorithm to the circular restricted three-body problem example and compare the final fuel cost against deterministic local optimization begun from the identical initial guess; absence of a clear fuel reduction would falsify the performance advantage.
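Operationally, that comparison could be run with a Monte Carlo harness along the following lines. This is a hedged sketch: the linear stand-in dynamics, the noise level, and the plan format (nominal controls plus gains) are illustrative assumptions; the paper's CR3BP integrator and solutions would replace them.

```python
import numpy as np

# Hedged evaluation sketch: given a plan (nominal controls plus feedback
# gains), roll out the closed loop under execution noise and report the
# mean total delta-v. Running this for both the SDDP plan and the
# deterministic plan from the same initial guess is the comparison above.
rng = np.random.default_rng(0)

def mean_delta_v(u_nominal, gains, x0, step, noise_std, n_runs=200):
    """Mean sum of |u| over closed-loop Monte Carlo rollouts."""
    totals = []
    for _ in range(n_runs):
        x = x0.copy()
        total = 0.0
        for u_bar, L in zip(u_nominal, gains):
            u = u_bar + L @ x                              # feedback correction
            u = u + noise_std * rng.standard_normal(u.shape)  # execution error
            total += np.linalg.norm(u)
            x = step(x, u)
        totals.append(total)
    return float(np.mean(totals))

# Stand-in linear dynamics; a CR3BP integrator would go here.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
step = lambda x, u: A @ x + B @ u
plan_u = [np.zeros(1)] * 20
plan_L = [np.array([[-1.0, -2.0]])] * 20
cost = mean_delta_v(plan_u, plan_L, np.array([0.1, 0.0]), step, 1e-3)
print(cost)
```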

read the original abstract

Designing spacecraft trajectories remains challenging in the presence of stochastic effects such as maneuver execution errors and observation uncertainties. Although covariance control and belief-space planning provide useful tools for designing robust control policies and information-aware trajectories under uncertainty, practical methods remain limited for partially observable trajectory optimization problems in which trajectory design, orbit determination, and correction maneuver planning are tightly coupled. This paper presents a stochastic differential dynamic programming algorithm for such coupled problems. The proposed method optimizes the nominal control sequence and feedback gains subject to belief dynamics and general mission constraints, explicitly accounting for the dependence of covariance propagation on the nominal trajectory without relying on the separation principle. Numerical examples demonstrate that the proposed algorithm produces navigation-aware and uncertainty-robust solutions across a range of dynamical systems, observation models, and uncertainty levels. In particular, the circular restricted three-body problem shows that the proposed method can exploit the coupling between trajectory design and orbit determination to obtain navigation-aware solutions with substantially lower fuel consumption than those from deterministic local optimization starting from the same initial guess.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces a stochastic differential dynamic programming (SDDP) algorithm for trajectory optimization in partially observable spacecraft problems. It jointly optimizes nominal control sequences and feedback gains subject to belief dynamics (capturing covariance propagation dependent on the nominal trajectory) and general mission constraints, without invoking the separation principle. Numerical examples across dynamical systems and uncertainty levels, including the circular restricted three-body problem (CR3BP), demonstrate navigation-aware solutions with substantially lower fuel consumption than deterministic local optimization from the same initial guess.

Significance. If the convergence properties and constraint satisfaction hold, the approach offers a practical advance for coupled trajectory design and orbit determination under uncertainty, potentially reducing fuel costs in information-aware planning. It extends standard stochastic control tools to non-separated estimation-control problems and provides reproducible numerical validation across multiple systems and observation models. This could influence robust mission design in multi-body dynamics where partial observability is critical.

major comments (2)
  1. [§4, §5] §4 (Algorithm Description) and §5 (Numerical Results): The manuscript lacks an explicit convergence analysis or proof for the SDDP iterations under the coupled belief dynamics. Without this, it is unclear whether the iterations reliably reach feasible points that respect all mission constraints while accurately propagating covariances that depend on the nominal trajectory, particularly in the nonlinear CR3BP where local optima or infeasibility may arise.
  2. [§5.2] §5.2 (CR3BP Example): The reported fuel savings are compared only to deterministic local optimization from the same initial guess. This risks attributing gains primarily to the expanded decision space (joint optimization of controls and gains) rather than the SDDP method itself; additional tests with multiple initial guesses or alternative stochastic baselines are needed to isolate the benefit of exploiting trajectory-OD coupling.
minor comments (2)
  1. [Eq. (12)] Notation for belief-state covariance propagation (e.g., around Eq. (12)) could be clarified to explicitly distinguish dependence on nominal trajectory versus control gains.
  2. [Figure 4] Figure 4 (CR3BP trajectories) would benefit from error bars or multiple-run statistics to illustrate variability under different uncertainty levels.
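On the notation point in minor comment 1, the two dependences can be made explicit in a closed-loop covariance recursion of the schematic form P_{k+1} = (A_k + B L_k) P_k (A_k + B L_k)^T + Q, where A_k is the dynamics Jacobian evaluated on the nominal trajectory and L_k is the feedback gain. The matrices below are illustrative stand-ins, not the paper's Eq. (12):

```python
import numpy as np

def closed_loop_covariance(A_list, B, L_list, Q, P0):
    """P_{k+1} = (A_k + B L_k) P_k (A_k + B L_k)^T + Q.

    A_k: dynamics Jacobian on the nominal trajectory (trajectory-dependent).
    L_k: feedback gain (a decision variable of the optimizer).
    """
    P = P0.copy()
    for A, L in zip(A_list, L_list):
        Acl = A + B @ L
        P = Acl @ P @ Acl.T + Q
    return P

B = np.array([[0.0], [0.1]])
Q = 1e-4 * np.eye(2)
P0 = 1e-2 * np.eye(2)
A_nom = [np.array([[1.0, 0.1], [0.0, 1.0]])] * 20   # Jacobians along a nominal
no_feedback = [np.zeros((1, 2))] * 20
damped = [np.array([[-1.0, -2.0]])] * 20
P_open = closed_loop_covariance(A_nom, B, no_feedback, Q, P0)
P_fb = closed_loop_covariance(A_nom, B, damped, Q, P0)
print(np.trace(P_open), np.trace(P_fb))  # gains shrink the dispersion
```

Separating A_k (trajectory dependence) from L_k (gain dependence) in the notation would make clear which derivative terms each contributes in the backward pass.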

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive review. The comments identify important areas for clarification and strengthening. We address each major comment below and indicate the revisions we will make to the manuscript.

read point-by-point responses
  1. Referee: [§4, §5] §4 (Algorithm Description) and §5 (Numerical Results): The manuscript lacks an explicit convergence analysis or proof for the SDDP iterations under the coupled belief dynamics. Without this, it is unclear whether the iterations reliably reach feasible points that respect all mission constraints while accurately propagating covariances that depend on the nominal trajectory, particularly in the nonlinear CR3BP where local optima or infeasibility may arise.

    Authors: We acknowledge that the manuscript does not contain a formal convergence proof for the SDDP iterations under the coupled belief dynamics. As with many practical DDP implementations in the literature, the method is validated through consistent numerical convergence across the presented examples. In the revised manuscript we will add to Section 4 a discussion of local convergence properties, extending standard DDP analysis to the belief-space setting with trajectory-dependent covariance propagation. We will also include iteration histories (cost and constraint violation) for the CR3BP case in Section 5 to demonstrate reliable progress toward feasible points. A complete theoretical guarantee for the general nonlinear, partially observable case is beyond the scope of this work, which prioritizes algorithmic formulation and empirical validation. revision: partial

  2. Referee: [§5.2] §5.2 (CR3BP Example): The reported fuel savings are compared only to deterministic local optimization from the same initial guess. This risks attributing gains primarily to the expanded decision space (joint optimization of controls and gains) rather than the SDDP method itself; additional tests with multiple initial guesses or alternative stochastic baselines are needed to isolate the benefit of exploiting trajectory-OD coupling.

    Authors: We agree that additional experiments would help isolate the contribution of the SDDP procedure. The deterministic baseline optimizes only the nominal trajectory and therefore does not incorporate belief dynamics or feedback gains; the comparison is intended to illustrate the value of navigation-aware planning. In the revision we will augment Section 5.2 with results from multiple distinct initial guesses for both the deterministic and SDDP methods, confirming that the reported fuel reductions are consistent. We will also clarify that the joint optimization of controls and gains is intrinsic to the SDDP formulation for coupled trajectory-OD problems and that separated stochastic baselines would not capture the same trajectory-dependent information coupling. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation from standard stochastic control principles

full rationale

The paper presents an SDDP algorithm derived from established stochastic dynamic programming and belief-space planning methods, optimizing nominal controls and feedback gains jointly under belief dynamics without the separation principle. Numerical examples in the CR3BP and other systems validate navigation-aware solutions with lower fuel use than deterministic baselines started from identical initial guesses. No load-bearing steps reduce by construction to fitted parameters, self-definitions, or self-citation chains; the central claims rest on independent numerical validation rather than tautological reductions. Minor self-citations (if present) are not load-bearing for the fuel-consumption results.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The approach rests on standard stochastic control and dynamic programming assumptions plus domain-specific models for spacecraft dynamics and observations; no new free parameters or invented entities are introduced in the abstract.

axioms (2)
  • domain assumption Belief dynamics and covariance propagation equations hold for the chosen stochastic models
    Required for the optimization to account for uncertainty dependence on trajectory
  • standard math Iterative differential dynamic programming converges under the given constraints
    Implicit in applying DDP to the stochastic belief-space problem
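The second axiom — that DDP-style iterations make reliable progress — can be illustrated on a scalar problem, where a line-searched backward/forward pass yields a monotonically non-increasing cost sequence. This is a generic iLQR sketch under illustrative dynamics and weights, not the paper's belief-space algorithm:

```python
import numpy as np

# Generic iLQR sketch on a scalar system x_{k+1} = x + dt*(u - x^3),
# minimizing sum 0.5*r*u^2 plus 0.5*q*(x_N - goal)^2. The line search
# accepts only cost decreases, so the iterate sequence is monotone --
# the empirical behavior the convergence axiom assumes.
dt, N, r, q, goal = 0.1, 20, 0.1, 10.0, 1.0

def f(x, u):
    return x + dt * (u - x**3)

def total_cost(x0, u_seq):
    x, c = x0, 0.0
    for u in u_seq:
        c += 0.5 * r * u * u
        x = f(x, u)
    return c + 0.5 * q * (x - goal) ** 2

def ilqr(x0, u_seq, iters=30):
    costs = [total_cost(x0, u_seq)]
    for _ in range(iters):
        xs = [x0]                      # roll out the current nominal
        for u in u_seq:
            xs.append(f(xs[-1], u))
        # backward pass (Gauss-Newton value expansion)
        Vx, Vxx = q * (xs[-1] - goal), q
        kff, Kfb = [0.0] * N, [0.0] * N
        for k in reversed(range(N)):
            fx, fu = 1.0 - 3.0 * dt * xs[k] ** 2, dt
            Qu = r * u_seq[k] + fu * Vx
            Quu = r + fu * fu * Vxx
            Qux = fu * fx * Vxx
            kff[k], Kfb[k] = -Qu / Quu, -Qux / Quu
            Vx = fx * Vx - Qux * Qu / Quu
            Vxx = fx * fx * Vxx - Qux * Qux / Quu
        # forward pass with backtracking line search
        improved = False
        for alpha in [1.0, 0.5, 0.25, 0.125, 0.0625]:
            x, u_try = x0, []
            for k in range(N):
                u_try.append(u_seq[k] + alpha * kff[k] + Kfb[k] * (x - xs[k]))
                x = f(x, u_try[-1])
            c = total_cost(x0, u_try)
            if c < costs[-1]:
                u_seq, improved = u_try, True
                costs.append(c)
                break
        if not improved:
            break
    return costs

costs = ilqr(0.0, [0.0] * N)
print(costs[0], costs[-1])  # cost decreases across iterations
```

Monotone descent on a smooth toy problem is of course weaker than the convergence-with-constraints property the referee asks about; it only shows the behavior the axiom takes for granted.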

pith-pipeline@v0.9.0 · 5473 in / 1362 out tokens · 29819 ms · 2026-05-11T01:55:59.543450+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

