arXiv: 2605.07038 · v1 · submitted 2026-05-07 · cs.LG · cs.MA · cs.RO


Learning Material-Aware Hamiltonian Risk Fields for Safe Navigation

Aditya Sai Ellendula , Yi Wang , Chandrajit Bajaj


Pith reviewed 2026-05-11 00:58 UTC · model grok-4.3


keywords: risk-aware navigation · port-Hamiltonian systems · context energy · CVaR · safe navigation · force fields · material-aware risk


The pith

Adding one context-energy term to a port-Hamiltonian navigation policy yields a force channel that activates toward a lower-risk direction only when that direction is feasible, and is suppressed otherwise.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to show that risk-aware navigation can be made selective by structure rather than by ad-hoc tuning. It adds a single context-energy term to an existing port-Hamiltonian policy so that the resulting force field points toward a safer escape route precisely when such a route exists locally and stays silent when the apparent escape is blocked. A CVaR objective concentrates learning on rare high-risk transitions. Experiments across simulated escape benchmarks, real off-road terrain, semantic maps, and highway traffic confirm that the selectivity property holds and improves success rates while reducing premature or false maneuvers.
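A minimal sketch of how such a term might enter the dynamics, assuming unit mass, linear damping, and a scalar gate. The Figure 1 caption describes adding −τ∇qHctx to the cotangent (momentum) update; everything else here (the function `step`, the toy gradients, the parameter values) is hypothetical and is not the authors' implementation.

```python
def step(q, p, grad_geo, grad_ctx, gate, tau=0.05, damping=0.5):
    """One semi-implicit Euler step of a damped Hamiltonian policy (sketch).

    q, p      : position and momentum, as lists of floats
    grad_geo  : callable q -> gradient of the geometric (goal) energy
    grad_ctx  : callable q -> gradient of the context energy
    gate      : scalar in [0, 1]; 0 switches the context channel off
    """
    g_geo = grad_geo(q)
    g_ctx = grad_ctx(q)
    # Momentum update: geometric force, gated context force, linear damping.
    p_new = [
        pi - tau * gi - tau * gate * ci - tau * damping * pi
        for pi, gi, ci in zip(p, g_geo, g_ctx)
    ]
    # Position update with unit mass, using the updated momentum.
    q_new = [qi + tau * pi for qi, pi in zip(q, p_new)]
    return q_new, p_new


# Toy energies (assumptions): goal at x = 5, lower risk toward y = 1.
grad_geo = lambda q: [q[0] - 5.0, 0.0]
grad_ctx = lambda q: [0.0, q[1] - 1.0]

q_open, p_open = step([0.0, 0.0], [0.0, 0.0], grad_geo, grad_ctx, gate=1.0)
q_shut, p_shut = step([0.0, 0.0], [0.0, 0.0], grad_geo, grad_ctx, gate=0.0)
```

With these toy energies the gated-on step gains a small lateral momentum toward the lower-risk side, the gated-off step gains none, and the goal-directed component is the same in both.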

Core claim

Adding one context-energy term to a port-Hamiltonian navigation policy produces a learned force channel whose gradient structure automatically enforces a falsifiable selectivity signature: the context force activates toward a feasible lower-risk direction when one exists and a route-aware gate suppresses lateral force when the escape is blocked or unavailable.
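The Figure 3 caption describes the gate as firing only when the local patch contains a cleared, traversable primitive that improves soft risk by a margin ρR. The sketch below is one hedged reading of that sentence. The `Primitive` record, the binary 0/1 output, and the function name `route_aware_gate` are assumptions made for illustration; the paper's gate may be soft or defined differently.

```python
from dataclasses import dataclass


@dataclass
class Primitive:
    cleared: bool       # no hard hazard along the primitive (assumed flag)
    traversable: bool   # geometrically feasible for the vehicle (assumed flag)
    soft_risk: float    # soft risk accumulated along the primitive


def route_aware_gate(current_risk, primitives, rho_r):
    """Return 1.0 if some cleared, traversable primitive lowers soft risk
    by at least the margin rho_r; otherwise return 0.0."""
    for prim in primitives:
        improves = current_risk - prim.soft_risk >= rho_r
        if prim.cleared and prim.traversable and improves:
            return 1.0
    return 0.0


open_lane = [Primitive(cleared=True, traversable=True, soft_risk=0.2)]
boxed_in = [Primitive(cleared=False, traversable=True, soft_risk=0.2)]

gate_open = route_aware_gate(0.8, open_lane, rho_r=0.1)    # 1.0
gate_boxed = route_aware_gate(0.8, boxed_in, rho_r=0.1)    # 0.0
```

The second case mirrors the "locally attractive but blocked" situation: the candidate has lower soft risk, yet the gate stays closed because the primitive is not cleared.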

What carries the argument

The context-energy term and its gradient, together with the route-aware gate, inside the port-Hamiltonian dynamics.

If this is right

Against DWA in the delayed-required-escape benchmark, the method cuts premature force activation from 0.95 to 0.18 and raises success from 0.48 to 0.81 with zero replans.
On real off-road terrain (RELLIS-3D) it reaches a correct activation rate of 0.837 and a false activation rate of 0.114, versus 0.378/0.752 for scalar risk gradients.
On static semantic maps (DFC2018) it drops catastrophic failure from 0.60 to 0.10 and reduces oscillation by 90.7 percent while keeping path length comparable.
In highway traffic it eliminates all collisions when a lane escape is feasible and suppresses the lateral command when no escape exists.
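The activation figures above are rates over labeled situations. This page does not give the paper's exact definitions, so the sketch below assumes one plausible reading: an "activation" is a lateral force magnitude above a threshold, the correct activation rate is computed over steps where an escape is feasible, and the false activation rate over steps where it is not. The record layout and the threshold are hypothetical.

```python
def activation_rates(records, threshold):
    """records: iterable of (escape_feasible: bool, lateral_force: float).

    Returns (correct_activation_rate, false_activation_rate). A rate is
    None when there are no records of the corresponding kind.
    """
    feasible = [abs(f) > threshold for ok, f in records if ok]
    blocked = [abs(f) > threshold for ok, f in records if not ok]
    correct_rate = sum(feasible) / len(feasible) if feasible else None
    false_rate = sum(blocked) / len(blocked) if blocked else None
    return correct_rate, false_rate


# Synthetic log with made-up values (not the paper's data).
log = [(True, 0.9), (True, 0.7), (True, 0.05), (False, 0.02), (False, 0.6)]
correct_rate, false_rate = activation_rates(log, threshold=0.1)
```

A selective policy in this sense would push the first number toward 1 and the second toward 0.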

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same gradient-structure argument could be ported to other Hamiltonian or energy-based controllers without requiring new training data.
The selectivity property may reduce the need for separate safety filters or replanning layers in real-time navigation stacks.
CVaR focusing on tail risk could be combined with other risk measures if the context-energy term is kept fixed.

Load-bearing premise

The learned context-energy term and its gradient will reliably produce the claimed selectivity across new environments without hidden post-hoc adjustments that break the structural guarantee.

What would settle it

A controlled test in which a lower-risk escape route is physically blocked yet the measured lateral force still activates or, conversely, an open escape route is present yet the force remains suppressed.
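Stated a little more formally, with symbols that are assumptions of this note rather than the paper's notation: let $b_t \in \{0,1\}$ indicate whether a feasible lower-risk escape exists at step $t$, let $F^{\perp}_t$ be the measured lateral context force, and let $\delta > 0$ be a detection threshold. In a controlled test, either of the following events would count against the signature:

```latex
\[
  \underbrace{b_t = 0 \;\wedge\; \lVert F^{\perp}_t \rVert > \delta}_{\text{activation with the escape blocked}}
  \qquad\text{or}\qquad
  \underbrace{b_t = 1 \;\wedge\; \lVert F^{\perp}_t \rVert \le \delta}_{\text{suppression with the escape open}}
\]
```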

Figures

Figures reproduced from arXiv: 2605.07038 by Aditya Sai Ellendula, Chandrajit Bajaj, Yi Wang.

**Figure 1.** Selective reshaping of the decision field. (A) Geometrically feasible maneuvers can differ in material risk. (B) Adding $-\tau \nabla_q H_{\mathrm{ctx}}$ to the cotangent update creates a context-force channel. (C) The channel bends toward a safer lane when one is feasible, but remains negligible when escape is boxed in. Sec. 4 measures this activation/suppression signature directly.

**Figure 2.** Factored stored energy and induced force channels. $H_\theta$ separates kinetic, geometric, dissipative, and context terms. The context term creates a soft-risk deflection channel and a hard-hazard repulsion channel. The route-aware gate lets the soft channel shift the field only when a feasible lower-risk maneuver exists; otherwise the rollout stays near the geometry-only policy.

**Figure 3.** Gate specification in the main method. The gate converts a risk-map cue into force activation only when the current local patch contains a cleared, traversable primitive that improves soft risk by margin $\rho_R$.

3.3 Tail-risk objective. Each rollout accumulates

$$J(\theta) = w_g \lVert q_T - q_g \rVert^2 + w_\ell \sum_t \lVert q_{t+1} - q_t \rVert + w_r \sum_t \tilde{r}(q_t)\, \lVert q_{t+1} - q_t \rVert + w_h \sum_t \mathbf{1}[\phi(q_t) < \epsilon]. \tag{6}$$

Because the relevant failures are rare, expected-cost training c…

**Figure 4.** Qualitative temporal selectivity in one delayed-required escape episode. Yellow dashed trajectories/arrows show behavior before the escape is available; solid colored trajectories/arrows show behavior after it opens. The geometry-only policy ignores the material update, DWA and black-box CVaR move before the escape is feasible and then stall, while route-aware context enrichment suppresses before $t_{\mathrm{escape}}$ a…

**Figure 5.** Three loops as three distinct computational jobs. The segment loop corrects active coefficients per step. The episode loop optimizes meta-parameters via CVaR. The curriculum loop advances training phases statistically.

A.8 Context-enriched training algorithm. A.9 Energy enrichment induces force and sensitivity channels. Lemma 1 (Energy enrichment induces force, sensitivity, and excitation channels). Let Henr…

**Figure 6.** Independent DFC path panels (episode 0124). Each panel shows one method's trajectory on the same episode. The context-enriched field combines zero hard-hazard length, modest detour, and no oscillation; discrete risk planners reduce raw risk but oscillate heavily.

**Figure 7.** RELLIS static regime panels. Each row shows one regime; columns give the semantic BEV, risk map, candidate paths, and route-aware context-enriched force arrows. R1: force bends toward the feasible lower-risk detour. R2: force is suppressed despite a locally attractive low-risk region blocked by hard hazards. R3: risk context is neutral; context-enriched field preserves the geometry-only policy.

**Figure 8.** RELLIS-Dyn corridor opens (dynamic R1). A blocked low-risk corridor becomes feasible at $t_{\mathrm{event}}$. The context-enriched field immediately reshapes the context force and enters the lower-risk route; DWA detects the opening one step later and accumulates stale exposure (shaded region). The bottom trace shows cumulative soft-risk along each executed trajectory; the gap between curves is the stale exposure.

**Figure 9.** RELLIS-Dyn 8-event group Pareto. Each marker is one method on one event group. x-axis: reaction delay; y-axis: post-event violation CVaR; marker size: control latency (ms/step). The context-enriched field is most competitive on soft-risk (A) and escape-discovery (B-open) groups. Reactive baselines lead on moving-obstacle (C) groups.

**Figure 10.** RELLIS-Dyn force-channel decomposition. Mean proxy magnitudes of $F_{\mathrm{soft}}$ and $F_{\mathrm{hard}}$ across eight event types. Soft events activate $F_{\mathrm{soft}}$; hard-boundary and compound events additionally activate $F_{\mathrm{hard}}$.

**Figure 11.** Highway trajectory panels. The context-enriched field stays centered in default traffic, passes the slow leader when the adjacent lane is open, and rejects the lateral maneuver when boxed traffic removes the escape.
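The rollout cost of Eq. (6), quoted alongside the Figure 3 caption, can be sketched term by term as below. This is an illustration under assumptions: positions are 2D tuples, ϕ is read as a clearance function for hard hazards, the path sums run over consecutive steps, and the hazard indicator is summed over every visited point. The function name, default weights, and example values are hypothetical.

```python
import math


def rollout_cost(path, goal, soft_risk, clearance,
                 w_g=1.0, w_l=1.0, w_r=1.0, w_h=1.0, eps=0.0):
    """Cost of one rollout, following the four terms of Eq. (6).

    path      : list of (x, y) positions q_0 .. q_T
    goal      : target position q_g
    soft_risk : callable q -> soft risk at q
    clearance : callable q -> phi(q); values below eps count as hard-hazard hits
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    steps = [dist(path[t + 1], path[t]) for t in range(len(path) - 1)]
    goal_term = dist(path[-1], goal) ** 2                     # terminal error
    length_term = sum(steps)                                  # path length
    risk_term = sum(soft_risk(path[t]) * steps[t]             # risk-weighted length
                    for t in range(len(steps)))
    hazard_term = sum(1 for q in path if clearance(q) < eps)  # hard-hazard count
    return (w_g * goal_term + w_l * length_term
            + w_r * risk_term + w_h * hazard_term)


path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
safe_cost = rollout_cost(path, (2.0, 0.0), lambda q: 0.5, lambda q: 1.0)
hit_cost = rollout_cost(path, (2.0, 0.0), lambda q: 0.5,
                        lambda q: -1.0 if q[0] == 1.0 else 1.0)
```

In this toy case the path reaches the goal, so the cost is the length (2) plus the risk-weighted length (1); entering a hazard at the middle point adds one unit through the indicator term.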

Original abstract

Risk-aware navigation should be selective: a policy should expose evasive degrees of freedom only when the local scene admits a lower-risk feasible maneuver, and suppress them when no safer alternative exists. We show that adding one context-energy term to a port-Hamiltonian navigation policy produces a learned force channel with exactly this falsifiable signature. When the local risk field contains a feasible lower-risk direction, the induced context force activates toward it; when the apparent escape is blocked or not yet available, a route-aware gate suppresses lateral force rather than hallucinating an unsafe maneuver. A CVaR tail-risk objective focuses gradient updates on rare but consequential risk transitions. We validate the selectivity signature across four settings. In the primary delayed-required-escape benchmark, route-aware CVaR reduces premature force activation from 0.950 to 0.180 versus DWA while raising success from 0.480 to 0.810 with zero replans. On real off-road terrain (RELLIS-3D), route-aware enrichment achieves correct activation rate 0.837 and false activation rate 0.114, compared to 0.378/0.752 for scalar risk gradients. On static semantic maps (DFC2018), enrichment reduces catastrophic failure from 0.60 to 0.10 and oscillation by 90.7% while preserving path efficiency. In highway traffic, collisions drop from 100% to 0% when a lane escape is feasible; when no escape exists, the policy suppresses the lateral maneuver. The selectivity property follows from the gradient structure of the context energy rather than from training-time tuning.
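For readers unfamiliar with the tail-risk objective mentioned in the abstract, the sketch below shows a common empirical estimator of CVaR: the mean of the worst fraction of episode costs. The paper's actual estimator and confidence level are not stated on this page, so the function name, the `alpha` value, and the sample costs are assumptions.

```python
import math


def empirical_cvar(costs, alpha=0.75):
    """Mean of the worst ceil((1 - alpha) * n) costs, where higher is worse."""
    if not costs:
        raise ValueError("costs must be non-empty")
    k = max(1, math.ceil((1.0 - alpha) * len(costs)))
    worst = sorted(costs, reverse=True)[:k]
    return sum(worst) / k


# Made-up episode costs: mostly benign, with two rare expensive failures.
costs = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 5.0, 9.0]
tail = empirical_cvar(costs, alpha=0.75)   # averages the two worst episodes
mean = sum(costs) / len(costs)
```

Here the tail average is much larger than the plain mean, which is why training on it concentrates the learning signal on the rare, costly episodes rather than on typical ones.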

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper adds a context-energy term to port-Hamiltonian navigation policies and trains it with CVaR to produce empirically selective force activation, but the claim that this selectivity is a strict algebraic consequence of the gradient structure rather than training effects needs explicit derivation.


The core contribution is showing that one added context-energy term inside a port-Hamiltonian policy, optimized under a CVaR objective, yields a force channel that activates toward feasible lower-risk directions and suppresses lateral force when no escape exists. This is tested in a delayed-required-escape benchmark, RELLIS-3D off-road terrain, DFC2018 semantic maps, and highway traffic, with reported gains in success rate, reduced premature activations, lower collision rates, and fewer oscillations compared to baselines like DWA and scalar risk gradients. The experiments are concrete and cover both simulation and real data, which is useful for seeing how the idea behaves across settings. The selectivity signature is presented as falsifiable and tied to the gradient of the learned term rather than extra tuning.

The soft spot is exactly the one the stress test flags: the abstract asserts the gating follows directly from the energy gradient as an algebraic property, yet provides no derivation or proof sketch showing that the lateral component vanishes identically when no lower-risk path is available. Because the term is learned from data, it is possible the observed behavior is an optimization artifact that depends on the CVaR focus and the training distribution rather than a guaranteed structural feature. If that is the case, the property could degrade under distribution shift.

The paper would be of interest to researchers working on risk-aware Hamiltonian control or safe navigation in unstructured environments. A reader looking for new ways to embed selectivity without heavy hyperparameter tuning could extract value from the empirical results and the framing. It deserves a serious referee because the experiments are multi-setting and the central idea is internally consistent, even though the structural claim requires more explicit support in the math.
I would send it to review and ask the authors to supply the gradient derivation or an ablation that isolates the structural contribution from the training objective.

Referee Report

2 major / 2 minor

Summary. The paper claims that augmenting a port-Hamiltonian navigation policy with a single learned context-energy term produces a force channel whose selectivity—activating toward feasible lower-risk directions while suppressing lateral forces via a route-aware gate when no safer escape exists—follows directly from the gradient structure of that term rather than from training-time tuning or post-processing. A CVaR tail-risk objective is used to focus learning on rare risk transitions. The selectivity signature is validated empirically across four settings: a delayed-required-escape benchmark (reducing premature activation from 0.950 to 0.180 and raising success from 0.480 to 0.810), real off-road terrain (RELLIS-3D), static semantic maps (DFC2018), and highway traffic, with reported gains in success rate, reduced failures/oscillations, and collision avoidance.

Significance. If the structural guarantee holds, the approach would offer a principled mechanism for embedding falsifiable selectivity into Hamiltonian policies without ad-hoc gating, which could improve safety and reliability in risk-aware navigation. The multi-setting empirical results, including real-world terrain and traffic scenarios, provide concrete evidence of practical gains over baselines like DWA and scalar risk gradients. However, the absence of an explicit derivation tying the observed gating behavior to the energy gradient alone limits the strength of the central contribution.

major comments (2)

[Abstract; context-energy definition section] Abstract and the section defining the context-energy term: the claim that selectivity 'follows from the gradient structure of the context energy rather than from training-time tuning' is load-bearing for the central contribution, yet no algebraic derivation is supplied showing that the route-aware gate (lateral force suppression when no lower-risk escape exists) is an identity consequence of the energy gradient independent of the learned parameters, the CVaR objective, or the training distribution. Without this, the property risks being optimization-dependent rather than structural.
[Experimental validation sections] Experimental sections (delayed-required-escape benchmark and RELLIS-3D results): while numerical improvements are reported (e.g., activation rates 0.837/0.114 vs. 0.378/0.752), there is no ablation that isolates the gradient-structure contribution from the CVaR loss or data-specific fitting. This is required to substantiate that the selectivity signature survives changes in the learned term or distribution shift.
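One hedged way to make the first major comment concrete is to separate what a gate would give by construction from what the learned gradient must supply. The notation below ($g$, $H_{\mathrm{ctx}}$, $\theta$) is assumed for illustration and is not taken from the paper.

```latex
\[
  F_{\mathrm{ctx}}(q) \;=\; -\,g(q)\,\nabla_q H_{\mathrm{ctx}}(q;\theta),
  \qquad g(q) \in [0,1].
\]
% Suppression: if g(q) = 0 wherever no feasible lower-risk primitive exists,
% then F_ctx(q) = 0 there for every theta. That part would be structural,
% but it would come from the gate rather than from the gradient.
% Activation: where g(q) = 1, the direction of F_ctx is
% -\nabla_q H_ctx(q; theta), which depends on the learned parameters theta.
```

Under this reading, suppression would be independent of training while the direction of activation would not, which is the distinction the requested derivation and ablation would need to resolve.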

minor comments (2)

The learning algorithm and optimization details for the context-energy parameters are not described at a level that would allow reproduction of the reported force-channel behavior.
Notation for the port-Hamiltonian policy and the added context-energy term should be introduced with explicit equations early in the manuscript to clarify how the gradient is computed and gated.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments highlight important points for strengthening the central claim regarding the structural origin of selectivity in the context-energy term. We address each major comment below and will revise the manuscript accordingly.


Referee: [Abstract; context-energy definition section] Abstract and the section defining the context-energy term: the claim that selectivity 'follows from the gradient structure of the context energy rather than from training-time tuning' is load-bearing for the central contribution, yet no algebraic derivation is supplied showing that the route-aware gate (lateral force suppression when no lower-risk escape exists) is an identity consequence of the energy gradient independent of the learned parameters, the CVaR objective, or the training distribution. Without this, the property risks being optimization-dependent rather than structural.

Authors: We agree that the absence of an explicit algebraic derivation weakens the structural claim. In the revised manuscript we will insert a new subsection that derives the force components directly from the gradient of the context-energy term within the port-Hamiltonian formulation. The derivation will show that lateral suppression occurs whenever the energy gradient has no admissible component toward a blocked or unavailable lower-risk direction; this identity holds from the definition of the energy as a function of the local risk field and route constraints, without reference to specific parameter values, the CVaR objective, or the training distribution. revision: yes
Referee: [Experimental validation sections] Experimental sections (delayed-required-escape benchmark and RELLIS-3D results): while numerical improvements are reported (e.g., activation rates 0.837/0.114 vs. 0.378/0.752), there is no ablation that isolates the gradient-structure contribution from the CVaR loss or data-specific fitting. This is required to substantiate that the selectivity signature survives changes in the learned term or distribution shift.

Authors: We concur that an ablation isolating the gradient-structure effect is necessary. The revised paper will add experiments that retrain the context-energy term under alternative objectives (standard expected risk and a non-CVaR surrogate) and evaluate the resulting policies under controlled distribution shifts. These results will demonstrate that the selectivity signature (correct activation when a feasible escape exists, suppression otherwise) persists across objectives, thereby supporting that the behavior originates from the energy gradient rather than from CVaR-specific fitting. revision: yes

Circularity Check

0 steps flagged

No significant circularity in the derivation chain


The paper augments a port-Hamiltonian policy with one learned context-energy term whose gradient is asserted to produce the described selectivity signature (activation toward feasible lower-risk directions, suppression via route-aware gate when blocked). This is presented as a structural consequence of the energy definition and gradient, not as a statistical artifact of the CVaR training procedure. Validation occurs across four distinct settings (synthetic benchmark, real off-road terrain, semantic maps, highway traffic) with quantitative metrics on activation rates and failure modes. No equation reduces the selectivity claim to a fitted parameter by algebraic identity, no self-citation supplies a load-bearing uniqueness theorem, and the CVaR objective is used only for optimization focus rather than to define the gate behavior itself. The central claim therefore remains independent of its training inputs.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on the existence of a learnable context-energy term whose gradient produces the stated selectivity; the abstract provides no explicit list of free parameters or axioms beyond the assumed port-Hamiltonian structure and CVaR objective.

free parameters (1)

context-energy parameters
The context-energy term is learned from data, implying fitted parameters whose values are not stated.

axioms (2)

[standard math] Port-Hamiltonian dynamics preserve passivity and energy-based control properties.
Invoked as the base policy structure to which the context term is added.
[domain assumption] The CVaR objective focuses gradients on tail-risk transitions.
Used to train the policy; assumed to produce the desired selectivity.
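For reference, the passivity property named in the first axiom follows from the standard textbook form of a port-Hamiltonian system. The block below states that general form only; how the paper instantiates $J$, $R$, $G$, and $H$ is not given on this page, so no claim is made about its specific policy.

```latex
\[
  \dot{x} = \bigl(J(x) - R(x)\bigr)\nabla H(x) + G(x)\,u,
  \qquad y = G(x)^{\top}\nabla H(x),
\]
\[
  J(x) = -J(x)^{\top}, \qquad R(x) = R(x)^{\top} \succeq 0 .
\]
% Energy balance: the skew-symmetric part J contributes nothing.
\[
  \dot{H} = \nabla H^{\top}(J - R)\nabla H + \nabla H^{\top} G\,u
          = -\,\nabla H^{\top} R\,\nabla H + y^{\top}u
          \;\le\; y^{\top}u .
\]
```

The stored energy can therefore grow no faster than the power supplied through the port $(u, y)$, which is the sense in which the base structure is passive.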

pith-pipeline@v0.9.0 · 5601 in / 1451 out tokens · 45019 ms · 2026-05-11T00:58:40.986656+00:00 · methodology


Reference graph

Works this paper leans on

164 extracted references · 33 canonical work pages · 2 internal anchors

[1]

2025 , eprint =

Yuze Wu and Zhichao Han and Xuankang Wu and Yuan Zhou and Junjie Wang and Zheng Fang and Fei Gao , title =. 2025 , eprint =

2025
[3]

2020 IEEE/RSJ international conference on intelligent robots and systems (IROS) , pages=

Learning to locomote with artificial neural-network and CPG-based control in a soft snake robot , author=. 2020 IEEE/RSJ international conference on intelligent robots and systems (IROS) , pages=. 2020 , organization=

2020
[4]

2024 , eprint =

Kaifeng Zhang and Baoyu Li and Kris Hauser and Yunzhu Li , title =. 2024 , eprint =

2024
[5]

2025 , eprint =

Nicholas Mohammad and Nicola Bezzo , title =. 2025 , eprint =

2025
[6]

2025 , eprint =

Harsh Modi and Hao Su and Xiao Liang and Minghui Zheng , title =. 2025 , eprint =

2025
[7]

ROBOMECH Journal , volume =

Changjian Ying and Kimitoshi Yamazaki , title =. ROBOMECH Journal , volume =. 2024 , url =

2024
[8]

2025 , eprint =

Zixi Chen and Di Wu and Qinghua Guan and David Hardman and Federico Renda and Josie Hughes and Thomas George Thuruthel and Cosimo Della Santina and Barbara Mazzolai and Huichan Zhao and others , title =. 2025 , eprint =

2025
[9]

Engineering with Computers , pages=

Open-source shape optimization for isogeometric shells using FEniCS and OpenMDAO , author=. Engineering with Computers , pages=. 2025 , publisher=

2025
[10]

Proceedings of the 2020 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM) , pages =

Shuzhen Luo and Merrill Edmonds and Jingang Yi and Xianlian Zhou and Yantao Shen , title =. Proceedings of the 2020 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM) , pages =. 2020 , publisher =

2020
[11]

SIAM review , volume=

A survey of the maximum principles for optimal control problems with state constraints , author=. SIAM review , volume=. 1995 , publisher=

1995
[12]

Jin, Wanxin and Mou, Shaoshuai and Pappas, George J , journal=. Safe
[13]

ACM Transactions on Graphics , volume=

Incremental Potential Contact: Intersection- and Inversion-free, Large-Deformation Dynamics , author=. ACM Transactions on Graphics , volume=. 2020 , doi=

2020
[14]

SIAM journal on numerical analysis , volume=

Symplectic integration of constrained Hamiltonian systems by composition methods , author=. SIAM journal on numerical analysis , volume=. 1996 , publisher=

1996
[15]

ACM Transactions on Graphics (TOG) , volume=

A material point method for snow simulation , author=. ACM Transactions on Graphics (TOG) , volume=. 2013 , publisher=

2013
[16]

Seminal Graphics Papers: Pushing the Boundaries, Volume 2 , pages=

Projective dynamics: Fusing constraint projections for fast simulation , author=. Seminal Graphics Papers: Pushing the Boundaries, Volume 2 , pages=
[17]

Computer physics communications , volume=

Application of a particle-in-cell method to solid mechanics , author=. Computer physics communications , volume=. 1995 , publisher=

1995
[18]

Computer Graphics Forum , volume=

Higher-order time integration for deformable solids , author=. Computer Graphics Forum , volume=. 2020 , doi=

2020
[19]

, author=

Smooth Surface Constructions via a Higher-Order Level-Set Method. , author=. CAD/Graphics , volume=
[20]

Foundations of Computational Mathematics , volume=

Shape-aware matching of implicit surfaces based on thin shell energies , author=. Foundations of Computational Mathematics , volume=. 2018 , publisher=

2018
[21]

VMV 2013: Vision, Modeling & Visualization , editor=

A Thin Shell Approach to the Registration of Implicit Surfaces , author=. VMV 2013: Vision, Modeling & Visualization , editor=. 2013 , publisher=. doi:10.2312/PE.VMV.VMV13.089-096 , url=

work page doi:10.2312/pe.vmv.vmv13.089-096 2013
[22]

Machine Learning , volume=

Efficient Weingarten map and curvature estimation on manifolds , author=. Machine Learning , volume=. 2021 , publisher=

2021
[23]

Annals of Statistics , year =

Nonasymptotic rates for manifold, tangent space and curvature estimation , author=. Annals of Statistics , year =
[24]

2504.02763 , archivePrefix=

CanonNet: Canonical Ordering and Curvature Learning for Point Cloud Analysis , author=. 2504.02763 , archivePrefix=

work page arXiv
[25]

Geometric Flows , volume=

Discretization and approximation of surfaces using varifolds , author=. Geometric Flows , volume=. 2018 , publisher=

2018
[26]

Proceedings

Estimating curvatures and their derivatives on triangle meshes , author=. Proceedings. 2nd International Symposium on 3D Data Processing, Visualization and Transmission, 2004. 3DPVT 2004. , pages=. 2004 , organization=

2004
[27]

IEEE Transactions on Visualization and Computer Graphics , volume=

Voronoi-based curvature and feature estimation from point clouds , author=. IEEE Transactions on Visualization and Computer Graphics , volume=. 2011 , publisher=

2011
[28]

Proceedings of IEEE International Conference on Computer Vision , pages=

Estimating the tensor of curvature of a surface from a polyhedral approximation , author=. Proceedings of IEEE International Conference on Computer Vision , pages=. 1995 , organization=

1995
[29]

2504.10783 , archivePrefix=

Superfast configuration-space convex set computation on GPUs for online motion planning , author=. 2504.10783 , archivePrefix=

work page arXiv
[30]

2504.18978 , archivePrefix=

A biconvex method for minimum-time motion planning through sequences of convex sets , author=. 2504.18978 , archivePrefix=

work page arXiv
[31]

2404.15617 , archivePrefix=

DPO: A Differential and Pointwise Control Approach to Reinforcement Learning , author=. 2404.15617 , archivePrefix=

work page arXiv
[32]

2025 , eprint=

PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos , author=. 2025 , eprint=

2025
[33]

Journal of Nonlinear Science , volume=

Stochastic port-Hamiltonian systems , author=. Journal of Nonlinear Science , volume=. 2022 , publisher=

2022
[34]

Proceedings 1995 IEEE/RSJ International Conference on Intelligent Robots and Systems

Planning conditional shortest paths through an unknown environment: A framed-quadtree approach , author=. Proceedings 1995 IEEE/RSJ International Conference on Intelligent Robots and Systems. Human Robot Interaction and Cooperative Robots , volume=. 1995 , organization=

1995
[35]

2006 IEEE/RSJ international conference on intelligent robots and systems , pages=

3d field d: Improved path planning and replanning in three dimensions , author=. 2006 IEEE/RSJ international conference on intelligent robots and systems , pages=. 2006 , organization=

2006
[36]

ESAIM: Mathematical Modelling and Numerical Analysis , volume=

Computing regularized splines in the Riemannian manifold of probability measures , author=. ESAIM: Mathematical Modelling and Numerical Analysis , volume=. 2025 , publisher=

2025
[37]

IEEE Transactions on pattern analysis and machine intelligence , number=

Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images , author=. IEEE Transactions on pattern analysis and machine intelligence , number=. 1984 , publisher=

1984
[38]

, author=

Neural networks and physical systems with emergent collective computational abilities. , author=. Proceedings of the national academy of sciences , volume=
[39]

Cognitive science , volume=

A learning algorithm for Boltzmann machines , author=. Cognitive science , volume=. 1985 , publisher=

1985
[40]

Predicting Structured Data , publisher=

A Tutorial on Energy-Based Learning , author=. Predicting Structured Data , publisher=. 2006 , url=

2006
[41]

Ferreira and R.B

M.M.A. Ferreira and R.B. Vinter , abstract =. When Is the Maximum Principle for State Constrained Problems Nondegenerate? , journal =. 1994 , issn =. doi:https://doi.org/10.1006/jmaa.1994.1366 , url =

work page doi:10.1006/jmaa.1994.1366 1994
[42]

SIAM Journal on Control and Optimization , volume=

Optimal control problems with mixed and pure state constraints , author=. SIAM Journal on Control and Optimization , volume=. 2016 , publisher=

2016
[43]

IEEE Transactions on automatic control , volume=

A discrete optimal control problem , author=. IEEE Transactions on automatic control , volume=. 1966 , publisher=

1966
[44]

Journal of Machine Learning Research , year =

Antonin Raffin and Ashley Hill and Adam Gleave and Anssi Kanervisto and Maximilian Ernestus and Noah Dormann , title =. Journal of Machine Learning Research , year =
[45]

Paszke, Adam and Gross, Sam and Massa, Francisco and Lerer, Adam and Bradbury, James and Chanan, Gregory and Killeen, Trevor and Lin, Zeming and Gimelshein, Natalia and Antiga, Luca and Desmaison, Alban and Kopf, Andreas and Yang, Edward and DeVito, Zachary and Raison, Martin and Tejani, Alykhan and Chilamkurthy, Sasank and Steiner, Benoit and Fang, Lu an...
[46]

, series =

Numerical Optimization , author =. 2006 , publisher =. doi:10.1007/978-0-387-40065-5 , isbn =

work page doi:10.1007/978-0-387-40065-5 2006
[47]

Dagger diffusion navigation: Dagger boosted diffusion policy for vision-language navigation.arXiv preprint arXiv:2508.09444,

DAgger Diffusion Navigation: DAgger Boosted Diffusion Policy for Vision-Language Navigation , author=. 2508.09444 , archivePrefix=

work page arXiv
[48]

Journal of Machine Learning Research , year =

Yanwei Jia and Xun Yu Zhou , title =. Journal of Machine Learning Research , year =
[49]

2505.15544 , archivePrefix=

A Temporal Difference Method for Stochastic Continuous Dynamics , author=. 2505.15544 , archivePrefix=

work page arXiv
[50]

2017 IEEE/RSJ international conference on intelligent robots and systems (IROS) , pages=

Virtual-to-real deep reinforcement learning: Continuous control of mobile robots for mapless navigation , author=. 2017 IEEE/RSJ international conference on intelligent robots and systems (IROS) , pages=. 2017 , organization=

2017
[51]

IEEE International Conference on Robotics and Automation (ICRA) , year =

Curiosity-driven Exploration for Mapless Navigation with Deep Reinforcement Learning , author =. IEEE International Conference on Robotics and Automation (ICRA) , year =
[52]

2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) , pages=

Terrapn: Unstructured terrain navigation using online self-supervised learning , author=. 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) , pages=. 2022 , organization=

2022
[53]

1997 , publisher=

The dynamic window approach to collision avoidance , author=. 1997 , publisher=

1997
[54]

SIAM Journal on Control and Optimization , volume=

Interior point methods in optimal control problems of affine systems: Convergence results and solving algorithms , author=. SIAM Journal on Control and Optimization , volume=. 2023 , publisher=

2023
[55]

Workshop on the Theory of AI for Scientific Computing , year=

Stochastic Differential Policy Optimization: A Rough Path Approach to Reinforcement Learning , author=. Workshop on the Theory of AI for Scientific Computing , year=
[56]

Middle East Oil, Gas and Geosciences Show (MEOS GEO) , series =

Bayesian Port–Hamiltonian Surrogate for Three-Phase Reservoir Flow Simulation , author =. Middle East Oil, Gas and Geosciences Show (MEOS GEO) , series =. 2025 , month =. doi:10.2118/227802-MS , url =

work page doi:10.2118/227802-ms 2025
[57]

2025 , eprint=

Learning Generalized Hamiltonian Dynamics with Stability from Noisy Trajectory Data , author=. 2025 , eprint=

2025
[58]

A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS).
[59]

Efficient Training of Artificial Neural Networks for Autonomous Navigation. Neural Computation.
[60]

Maximum Entropy Inverse Reinforcement Learning. Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence.
[61]

Deep Reinforcement Learning from Human Preferences. Advances in Neural Information Processing Systems (NeurIPS).
[62]

Sampling-Based Algorithms for Optimal Motion Planning. The International Journal of Robotics Research.
[63]

Information Theoretic MPC for Model-Based Reinforcement Learning. IEEE International Conference on Robotics and Automation (ICRA).
[64]

A Formal Basis for the Heuristic Determination of Minimum Cost Paths. IEEE Transactions on Systems Science and Cybernetics, 1968.
[65]

Brian Ichter, Anthony Brohan, Yevgen Chebotar, Chelsea Finn, Karol Hausman, Alexander Herzog, Daniel Ho, Julian Ibarz, Alex Irpan, Eric Jang, Ryan Julian, Dmitry Kalashnikov, Sergey Levine, Yao Lu, Carolina Parada, Kanishka Rao, Pierre Sermanet, Alexander T. Toshev, Vincent Vanhoucke... 2023.
[66]

Code as Policies: Language Model Programs for Embodied Control. IEEE International Conference on Robotics and Automation (ICRA).
[67]

Constrained Policy Optimization. Proceedings of the 34th International Conference on Machine Learning (ICML).
[68]

Yinlam Chow, Aviv Tamar, Shie Mannor, Marco Pavone. Risk-Sensitive and Robust Decision-Making: A ... 2015.
[69]

Port-Hamiltonian Neural Networks for Learning Explicit Time-Dependent Dynamical Systems. Physical Review E, 2021.
[70]

Nils Roth, Dominik K. Klein, Maximilian Kannapinn, Jan Peters, Oliver Weeger. Stable Port-Hamiltonian Neural Networks. arXiv:2502.02480.
[71]

Hamiltonian Neural Networks. Advances in Neural Information Processing Systems (NeurIPS).
[72]

Control Barrier Function Based Quadratic Programs for Safety Critical Systems.
[73]

Experience Replay for Continual Learning. Advances in Neural Information Processing Systems (NeurIPS).
[74]

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. Proceedings of the 34th International Conference on Machine Learning (ICML).
[75]

Proximal Policy Optimization Algorithms. 2017.
[76]

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. 2018.
[77]

FeUdal Networks for Hierarchical Reinforcement Learning. 2017.
[78]

Erik Wijmans, Abhishek Kadian, Ari Morcos, Stefan Lee, Irfan Essa, Devi Parikh, Manolis Savva, Dhruv Batra.
[79]

Kimera: an Open-Source Library for Real-Time Metric-Semantic Localization and Mapping. IEEE International Conference on Robotics and Automation (ICRA).
[80]

Maggie Wigness, Sungmin Eum, John G. Rogers, David Han, Heesung Kwon. A ... 2019. doi:10.1109/IROS40897.2019.8968283
[81]

Peng Jiang, Philip Osteen, Maggie Wigness, Srikanth Saripalli.
