pith. machine review for the scientific record.

arXiv: 2604.18343 · v1 · submitted 2026-04-20 · cs.RO · cs.SY · eess.SY

Recognition: unknown

DAG-STL: A Hierarchical Framework for Zero-Shot Trajectory Planning under Signal Temporal Logic Specifications

Ancheng Hou, Ruijia Liu, Xiang Yin, Xiao Yu


Pith reviewed 2026-05-10 03:49 UTC · model grok-4.3

classification: cs.RO · cs.SY · eess.SY
keywords: Signal Temporal Logic · Zero-shot planning · Trajectory planning · Diffusion models · Hierarchical frameworks · Robotic navigation · Offline learning · Temporal specifications

The pith

A hierarchical framework decomposes STL specifications into reachability and invariance conditions to plan trajectories without knowing robot dynamics.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to plan robot trajectories that satisfy complex timed tasks expressed in Signal Temporal Logic, even when the robot's movement rules are unknown and no task-specific training data exists. It does this by first breaking the logic into simpler reachability and invariance goals connected by timing links, then using learned estimates to pick waypoints, and finally generating the paths between them with a diffusion model. This separation lets the system handle long sequences better than trying to generate the whole path at once while satisfying the logic. A sympathetic reader would care because it opens the door to programming robots with high-level instructions in unknown or changing environments using only generic movement data.

Core claim

DAG-STL converts long-horizon STL planning into a decomposition-allocation-generation pipeline. An STL formula is decomposed into reachability and invariance progress conditions linked by shared timing constraints. Timed waypoints are allocated using reachability-time estimates learned from task-agnostic data. Trajectories between waypoints are synthesized by a diffusion-based generator. Three additional mechanisms (a rollout-free dynamic consistency metric, an anytime refinement search, and hierarchical online replanning) are meant to bridge the gap between planning-level correctness and execution-level feasibility.
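The allocation stage can be illustrated with a minimal sketch. It assumes a purely sequential task in which each target must be reached inside a given time window, and it replaces the learned reachability-time estimator with a hypothetical straight-line estimate (`reach_time`). It also assumes the robot may wait at a waypoint when it arrives early. None of the names below come from the paper, and the paper's actual allocator is not described in the material available here.

```python
from math import hypot
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Window = Tuple[float, float]  # (earliest, latest) allowed arrival time


def reach_time(a: Point, b: Point, speed: float = 1.0) -> float:
    """Hypothetical stand-in for a learned reachability-time estimate."""
    return hypot(b[0] - a[0], b[1] - a[1]) / speed


def allocate_times(
    start: Point,
    waypoints: Sequence[Point],
    windows: Sequence[Window],
    estimate: Callable[[Point, Point], float] = reach_time,
) -> Optional[List[float]]:
    """Greedy earliest-arrival allocation; returns None if a window is missed."""
    t, here = 0.0, start
    times: List[float] = []
    for wp, (lo, hi) in zip(waypoints, windows):
        arrival = max(t + estimate(here, wp), lo)  # wait if arriving early (assumption)
        if arrival > hi:
            return None  # estimated travel time does not fit the window
        times.append(arrival)
        t, here = arrival, wp
    return times


goals = [(3.0, 4.0), (3.0, 0.0)]
print(allocate_times((0.0, 0.0), goals, [(0.0, 6.0), (10.0, 12.0)]))  # [5.0, 10.0]
print(allocate_times((0.0, 0.0), goals, [(0.0, 4.0), (10.0, 12.0)]))  # None
```

The second call fails because the first target is estimated to be 5 time units away while its window closes at 4. This is the kind of infeasibility an allocator has to detect before any trajectory is generated.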

What carries the argument

The decomposition of STL formulas into reachability and invariance progress conditions linked by shared timing constraints, followed by waypoint allocation and diffusion generation.
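As a rough illustration of what reachability and invariance progress conditions ask of a trajectory, the sketch below checks both kinds on a discrete-time trajectory once their time intervals have been fixed. The class name `ProgressCondition`, the integer time steps, and the convention that `h(x) >= 0` means the predicate holds are assumptions made for this example, not the paper's definitions.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

State = Sequence[float]


@dataclass(frozen=True)
class ProgressCondition:
    kind: str                    # "reach" or "invariance"
    start: int                   # first time step of the interval (inclusive)
    end: int                     # last time step of the interval (inclusive)
    h: Callable[[State], float]  # assumed convention: h(x) >= 0 means the predicate holds


def holds(cond: ProgressCondition, traj: Sequence[State]) -> bool:
    """Check one progress condition on a discrete-time trajectory."""
    if cond.kind not in ("reach", "invariance"):
        raise ValueError(f"unknown kind: {cond.kind}")
    if cond.start < 0 or cond.end < cond.start:
        raise ValueError("interval must satisfy 0 <= start <= end")
    if cond.end >= len(traj):
        return False  # trajectory too short to cover the interval
    satisfied = [cond.h(x) >= 0 for x in traj[cond.start:cond.end + 1]]
    # Reachability needs one satisfying step; invariance needs all of them.
    return any(satisfied) if cond.kind == "reach" else all(satisfied)


def all_hold(conds: Sequence[ProgressCondition], traj: Sequence[State]) -> bool:
    return all(holds(c, traj) for c in conds)


traj = [(0.0,), (1.0,), (2.0,), (3.0,)]
near_two = lambda x: 0.5 - abs(x[0] - 2.0)  # within 0.5 of position 2
below_limit = lambda x: 3.5 - x[0]          # position stays below 3.5
conds = [
    ProgressCondition("reach", 1, 3, near_two),
    ProgressCondition("invariance", 0, 3, below_limit),
]
print(all_hold(conds, traj))  # True
```

In this simplified form, the shared timing constraints would show up as the same variable determining the `start` and `end` fields of several conditions. The sketch takes those values as already chosen.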

If this is right

  • Global planning reduces to shorter, better-supported subproblems that the diffusion generator can handle.
  • Substantially better performance than direct robustness-guided diffusion on complex long-horizon STL tasks.
  • Generalization across navigation tasks like Maze2D and AntMaze and manipulation in the Cube domain.
  • Recovery of most tasks solvable by optimization with explicit models, but with lower computation time.
  • Support for execution-time recovery through online replanning.
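For context on the robustness-guided baseline mentioned in the list above, the following sketch computes the standard quantitative (robustness) semantics of the "eventually" and "always" operators at time 0 on a discrete-time trajectory. It is not the baseline's implementation. The function names and the one-dimensional example are assumptions made for illustration.

```python
from typing import Callable, Sequence

State = Sequence[float]
Predicate = Callable[[State], float]  # h(x) > 0 means satisfied with margin h(x)


def _window(traj: Sequence[State], a: int, b: int) -> Sequence[State]:
    if a < 0 or b < a or b >= len(traj):
        raise ValueError("need 0 <= a <= b < len(traj)")
    return traj[a:b + 1]


def rho_eventually(h: Predicate, traj: Sequence[State], a: int, b: int) -> float:
    """Robustness of F_[a,b] h at time 0: best margin over the window."""
    return max(h(x) for x in _window(traj, a, b))


def rho_always(h: Predicate, traj: Sequence[State], a: int, b: int) -> float:
    """Robustness of G_[a,b] h at time 0: worst margin over the window."""
    return min(h(x) for x in _window(traj, a, b))


traj = [(0.0,), (1.0,), (2.0,), (3.0,)]
past_marker = lambda x: x[0] - 1.5  # positive once the position exceeds 1.5
print(rho_eventually(past_marker, traj, 0, 3))  # 1.5
print(rho_always(past_marker, traj, 0, 3))      # -1.5
print(rho_always(past_marker, traj, 2, 3))      # 0.5
```

A positive value means the formula is satisfied with that margin and a negative value means it is violated. Direct guidance has to push this single number upward over the whole horizon at once, whereas the decomposition splits the same requirement into shorter segments.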

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach suggests that logical decomposition can make offline learned generators more reliable for temporal constraints without retraining per task.
  • If the timing links hold, this method could scale to even longer horizons by recursing the hierarchy.
  • Extensions might include combining with model predictive control for tighter dynamic consistency.
  • Testable by applying to new STL formulas not seen in training data for the estimator.

Load-bearing premise

The decomposition of an STL formula into reachability and invariance progress conditions linked by shared timing constraints remains valid and sufficient for planning when system dynamics are completely unknown and only task-agnostic trajectory data is available.

What would settle it

An experiment in the OGBench AntMaze domain where DAG-STL fails to produce any valid trajectory for a long-horizon STL specification involving multiple timed reachability and invariance phases, while a direct diffusion method guided by robustness succeeds in finding a satisfying trajectory.

Figures

Figures reproduced from arXiv: 2604.18343 by Ancheng Hou, Ruijia Liu, Xiang Yin, Xiao Yu.

Figure 1: The overall framework of DAG-STL. Importantly, although the dynamics model and the environment geometry are not explicitly available, the task predicates that appear in the STL specification are assumed to be queryable. That is, for each atomic predicate µ, its evaluation function h_µ(x) is given, so that whether a state satisfies µ can be determined. In this setting, satisfying the STL formula alone is not…
Figure 2: Decomposition process of the STL formula in (14). 4) If φ′ = ϕ U_[a,b] φ, then we: (i) introduce a new time variable λ_i, (ii) add the constraint λ_i ∈ [a, b], (iii) shift every progress condition in P_φ by λ_i, and (iv) extend every invariance progress condition in P_ϕ up to the chosen until time. Formally, Λ_φ′ = Λ_φ ⊎ {λ_i}, T_φ′ = T_φ ⊎ {λ_i ∈ [a, b]}, P_φ′ = {P(c_Λ + λ_i, d_Λ + λ_i, µ) | P(c_Λ, d_Λ, µ) ∈ P_φ} ∪ {I(c, …
Figure 3: Visualization of the environments used in our experiments. From left to right and top to bottom: the three Maze2D environments (Umaze, Medium, Large), the AntMaze environment, the Cube environment and the custom-built environment. environment, planning is performed directly in the low-dimensional state space, and execution is carried out by a PD controller with k = 1. In Cube and AntMaze, by contrast, plan…
Figure 4: Planned trajectory (left) and executed trajectory (right) for the sequential visit task (29). The numbers next to the start point and target regions indicate the satisfaction times assigned by the progress allocation module on the planning time scale. The plotted trajectories are shown at the finer trajectory resolution (η = 8) used for generation and execution; see Section 7.1.3 for details. before enteri…
Figure 5: Planned trajectory (left) and executed trajectory (right) for the hybrid task (30). The numbers next to the start point and target regions indicate the satisfaction times assigned by the progress allocation module on the planning time scale. The plotted trajectories are shown at the finer trajectory resolution (η = 8) used for generation and execution. incompatibility with DAG-STL, but the fact that, after…
Figure 6 visualizes the two cases. The first two rows show the timed waypoint allocations on the planning horizon for the two tasks, illustrating how the allocation module schedules repeated and nested visitation requirements over time. The third row shows the corresponding executed trajectories. Taken together, these two cases show that nontrivial STL tasks with outer-global eventually obligations can be successfu…
Figure 8: Example multi-stage manipulation tasks in the Cube environment. Solid cubes indicate the initial configuration and translucent cubes indicate the target configuration. A single-cube manipulation task can then be written as φ_i = F_{I_i}
Figure 9: Representative manipulation cases in the Cube environment. Each column corresponds to one task instance. From top to bottom, the rows show the initial configuration, the final configuration, and the planned (blue)/executed (yellow) end-effector trajectories, respectively.
Figure 10: Comparison of average planning and execution success rates across Maze2D environments. In each panel, the x-axis lists the five methods: B(Heuristic), B(l_min), B(l_norm), B(l_max), and ARS, and the y-axis reports success rate (%). For each environment-method pair, planning and execution success rates are averaged over the nine task templates. The two values are shown as horizontal markers connected by a ver…
Figure 11: Effectiveness of DCM in Maze2D-Large. Upper panel: execution success rate versus DCM rank among 10 sampled candidates from the same case. Lower panel: execution success rate across global DCM-score deciles. start state, one circular goal region, and two circular avoidance regions. To match the actual use of DCM as closely as possible while isolating the trajectory-generation-to-execution pipeline from the…
Figure 12: Representative sampled layouts for the combinatorial stress test in the custom-built environment, shown here for n = 4 and n = 6. Under this obstacle-centered layout, a natural feasible strategy is to start from the initial state and visit the regions in either clockwise or counterclockwise order, while the STL task itself still leaves the order unspecified. Under such layouts, completing the task within…
Figure 13: Results on the combinatorial stress test in the custom-built environment. Each column fixes a slack ratio ρ and varies the number of unordered goals n. The upper panel reports SR_0, and the lower panel reports planning time. finding a good allocation under limited search budget rather than arbitrary task infeasibility or severe local distribution mismatch. Compared Variants and Evaluation On the accepted t…
Original abstract

Signal Temporal Logic (STL) is a powerful language for specifying temporally structured robotic tasks. Planning executable trajectories under STL constraints remains difficult when system dynamics and environment structure are not analytically available. Existing methods typically either assume explicit models or learn task-specific behaviors, limiting zero-shot generalization to unseen STL tasks. In this work, we study offline STL planning under unknown dynamics using only task-agnostic trajectory data. Our central design philosophy is to separate logical reasoning from trajectory realization. We instantiate this idea in DAG-STL, a hierarchical framework that converts long-horizon STL planning into three stages. It first decomposes an STL formula into reachability and invariance progress conditions linked by shared timing constraints. It then allocates timed waypoints using learned reachability-time estimates. Finally, it synthesizes trajectories between these waypoints with a diffusion-based generator. This decomposition-allocation-generation pipeline reduces global planning to shorter, better-supported subproblems. To bridge the gap between planning-level correctness and execution-level feasibility, we further introduce a rollout-free dynamic consistency metric, an anytime refinement search procedure for improving multiple allocation hypotheses under finite budgets, and a hierarchical online replanning mechanism for execution-time recovery. Experiments in Maze2D, OGBench AntMaze, and the Cube domain show that DAG-STL substantially outperforms direct robustness-guided diffusion on complex long-horizon STL tasks and generalizes across navigation and manipulation settings. In a custom environment with an optimization-based reference, DAG-STL recovers most model-solvable tasks while retaining a clear computational advantage over direct optimization based on the explicit system model.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper introduces DAG-STL, a hierarchical framework for zero-shot trajectory planning under STL specifications with unknown dynamics, using only task-agnostic data. The framework decomposes STL formulas into reachability and invariance progress conditions linked by shared timing constraints, allocates timed waypoints using learned reachability-time estimates, and synthesizes trajectories with a diffusion-based generator. It also includes a rollout-free dynamic consistency metric, an anytime refinement search, and hierarchical online replanning. Experiments in Maze2D, OGBench AntMaze, and the Cube domain are reported to show substantial outperformance over direct robustness-guided diffusion on complex long-horizon tasks and generalization across navigation and manipulation. In a custom environment, DAG-STL recovers most model-solvable tasks with a computational advantage over direct optimization.

Significance. If the decomposition preserves STL semantics under approximate learned timing estimates, this would be a significant contribution to data-driven robotic planning, allowing zero-shot generalization to unseen STL tasks without requiring explicit models or task-specific learning. The separation of logical reasoning from trajectory realization, combined with mechanisms for dynamic consistency and replanning, addresses key challenges in long-horizon planning. The empirical results suggest practical advantages over both diffusion and optimization baselines.

major comments (1)
  1. [§3 (Decomposition-Allocation-Generation Pipeline)] The claim that the decomposition of an arbitrary STL formula into reachability and invariance sub-conditions linked by shared timing constraints, combined with learned reachability-time estimates from task-agnostic trajectories, supports zero-shot planning under unknown dynamics lacks a formal argument that the resulting allocations preserve the semantics of the original formula. Errors in timing estimates can propagate through the shared constraints, potentially rendering trajectories invalid despite local diffusion success. This is load-bearing for the central zero-shot claim, as no model-based fallback is provided.
minor comments (2)
  1. [Abstract] The abstract asserts outperformance and generalization but does not provide quantitative metrics, error bars, or ablation details, making it challenging to fully evaluate the empirical support without the full manuscript.
  2. [Experiments] The description of the custom environment with optimization-based reference could benefit from more details on how the comparison is conducted to ensure fair assessment of computational advantage.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive review and for recognizing the potential significance of separating logical reasoning from trajectory realization in zero-shot STL planning. We address the major comment on the formal aspects of the decomposition below.

Point-by-point responses
  1. Referee: [§3 (Decomposition-Allocation-Generation Pipeline)] The claim that the decomposition of an arbitrary STL formula into reachability and invariance sub-conditions linked by shared timing constraints, combined with learned reachability-time estimates from task-agnostic trajectories, supports zero-shot planning under unknown dynamics lacks a formal argument that the resulting allocations preserve the semantics of the original formula. Errors in timing estimates can propagate through the shared constraints, potentially rendering trajectories invalid despite local diffusion success. This is load-bearing for the central zero-shot claim, as no model-based fallback is provided.

    Authors: We agree that the manuscript does not contain a formal proof establishing semantic preservation of the original STL formula under approximate learned timing estimates. The decomposition into reachability and invariance progress conditions with shared timing constraints follows standard STL decomposition principles that preserve semantics exactly when timing values are precise. The learned reachability-time estimates, obtained from task-agnostic trajectories, necessarily introduce approximation under unknown dynamics. To mitigate error propagation through the shared constraints, the framework includes a rollout-free dynamic consistency metric that evaluates allocation feasibility without full simulation, an anytime refinement search that improves allocation hypotheses under finite compute budgets, and hierarchical online replanning that enables execution-time recovery. These mechanisms are intended to maintain practical correctness even when local timing estimates are imperfect. While these safeguards do not constitute formal guarantees, the experiments show that DAG-STL solves substantially more complex long-horizon tasks than direct robustness-guided diffusion and recovers most tasks solvable by an explicit-model optimizer. We will revise the manuscript to (i) clarify in §3 that the decomposition preserves semantics exactly only under precise timings and that learned estimates are approximations, (ii) expand the discussion of how the consistency metric, refinement search, and replanning address propagation risks, and (iii) explicitly acknowledge the absence of a model-based fallback and formal semantic guarantees as current limitations, with suggested directions for future theoretical work. revision: partial

Circularity Check

0 steps flagged

No significant circularity in DAG-STL decomposition-allocation-generation pipeline

full rationale

The paper presents a methodological framework that decomposes STL formulas into reachability and invariance progress conditions as a design choice, learns reachability-time estimates from task-agnostic trajectory data, allocates waypoints, and synthesizes trajectories via diffusion. No equations or derivations are provided that reduce any prediction or result to its inputs by construction, no self-citations serve as load-bearing uniqueness theorems, and no ansatz is smuggled via prior work. The central claims rest on empirical validation across Maze2D, AntMaze, and Cube domains rather than tautological reductions, rendering the approach self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities are identifiable. The learned reachability-time estimates are implied to be data-driven but their fitting details are unknown.

pith-pipeline@v0.9.0 · 5594 in / 1042 out tokens · 44580 ms · 2026-05-10T03:49:26.510296+00:00 · methodology


Reference graph

Works this paper leans on

102 extracted references · 13 canonical work pages · 4 internal anchors

  1. [1]

    2005 , publisher=

    Algorithmic learning in a random world , author=. 2005 , publisher=

  2. [2]

    IEEE/RSJ International Conference on Intelligent Robots and Systems , pages=

    Multi-agent reinforcement learning guided by signal temporal logic specifications , author=. IEEE/RSJ International Conference on Intelligent Robots and Systems , pages=

  3. [3]

    International conference on machine learning , pages=

    Out-of-distribution detection with deep nearest neighbors , author=. International conference on machine learning , pages=. 2022 , organization=

  4. [4]

    Proceedings of the 2000 ACM SIGMOD international conference on Management of data , pages=

    LOF: identifying density-based local outliers , author=. Proceedings of the 2000 ACM SIGMOD international conference on Management of data , pages=

  5. [5]

    Proceedings of the 2000 ACM SIGMOD international conference on Management of data , pages=

    Efficient algorithms for mining outliers from large data sets , author=. Proceedings of the 2000 ACM SIGMOD international conference on Management of data , pages=

  6. [6]

    SafeDiffuser: Safe Planning with Diffusion Probabilistic Models , booktitle =

    Wei Xiao and Tsun. SafeDiffuser: Safe Planning with Diffusion Probabilistic Models , booktitle =

  7. [7]

    arXiv preprint arXiv:2011.04950 , year=

    Model-based reinforcement learning from signal temporal logic specifications , author=. arXiv preprint arXiv:2011.04950 , year=

  8. [8]

    Annual Review of Control, Robotics, and Autonomous Systems , volume=

    Formal methods for control synthesis: An optimization perspective , author=. Annual Review of Control, Robotics, and Autonomous Systems , volume=. 2019 , publisher=

  9. [9]

    Control Engineering Practice , volume=

    Signal temporal logic synthesis under model predictive control: A low complexity approach , author=. Control Engineering Practice , volume=. 2024 , publisher=

  10. [10]

    IEEE Conference on Control Technology and Applications , pages=

    Smooth operator: Control using the smooth robustness of temporal logic , author=. IEEE Conference on Control Technology and Applications , pages=. 2017 , organization=

  11. [11]

    Nonlinear Analysis: Hybrid Systems , volume=

    STL and wSTL control synthesis: A disjunction-centric mixed-integer linear programming approach , author=. Nonlinear Analysis: Hybrid Systems , volume=. 2025 , publisher=

  12. [12]

    IEEE Transactions on Robotics , volume=

    Soft robots modeling: A structured overview , author=. IEEE Transactions on Robotics , volume=. 2023 , publisher=

  13. [13]

    Learning for dynamics and control , pages=

    Tractable reinforcement learning of signal temporal logic objectives , author=. Learning for dynamics and control , pages=. 2020 , organization=

  14. [14]

    The International Journal of Robotics Research , volume=

    Kinematic issues in 6R cuspidal robots, guidelines for path planning and deciding cuspidality , author=. The International Journal of Robotics Research , volume=. 2025 , publisher=

  15. [15]

    IEEE Control Systems Letters , volume=

    Trajectory optimization for high-dimensional nonlinear systems under STL specifications , author=. IEEE Control Systems Letters , volume=. 2020 , publisher=

  16. [16]

    Annual Review of Control, Robotics, and Autonomous Systems , volume=

    Synthesis for robots: Guarantees and feedback for robot behavior , author=. Annual Review of Control, Robotics, and Autonomous Systems , volume=. 2018 , publisher=

  17. [17]

    2006 , publisher=

    Planning algorithms , author=. 2006 , publisher=

  18. [18]

    Conference on Robot Learning , pages=

    Learning from demonstrations using signal temporal logic , author=. Conference on Robot Learning , pages=

  19. [19]

    IEEE Robotics and Automation Letters , volume=

    Cooperative object manipulation under signal temporal logic tasks and uncertain dynamics , author=. IEEE Robotics and Automation Letters , volume=. 2022 , publisher=

  20. [20]

    IEEE Transactions on Robotics , year=

    Robust-locomotion-by-logic: Perturbation-resilient bipedal locomotion via signal temporal logic guided model predictive control , author=. IEEE Transactions on Robotics , year=

  21. [21]

    7th Annual Learning for Dynamics & Control Conference , pages=

    STLGame: Signal Temporal Logic Games in Adversarial Multi-Agent Systems , author=. 7th Annual Learning for Dynamics & Control Conference , pages=. 2025 , organization=

  22. [22]

    Annual Reviews in Control , volume=

    Formal synthesis of controllers for safety-critical autonomous systems: Developments and challenges , author=. Annual Reviews in Control , volume=. 2024 , publisher=

  23. [23]

    Kapoor, Parv and Mizuta, Kazuki and Kang, Eunsuk and Leung, Karen , journal=

  24. [24]

    IEEE Robotics and Automation Letters , volume=

    Power line inspection tasks with multi-aerial robot systems via signal temporal logic specifications , author=. IEEE Robotics and Automation Letters , volume=. 2021 , publisher=

  25. [25]

    ACM Computing Surveys , volume=

    Formal specification and verification of autonomous robotic systems: A survey , author=. ACM Computing Surveys , volume=. 2019 , publisher=

  26. [26]

    International Conference on Learning Representations , year=

    OGBench: Benchmarking Offline Goal-Conditioned RL , author=. International Conference on Learning Representations , year=

  27. [27]

    arXiv preprint arXiv:2408.08252 , year =

    Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding , author=. arXiv preprint arXiv:2408.08252 , year=

  28. [28]

    IEEE Transactions on Robotics , volume=

    Continuous-time control synthesis under nested signal temporal logic specifications , author=. IEEE Transactions on Robotics , volume=. 2024 , publisher=

  29. [29]

    European Conference on Computer Vision , pages=

    Diffusion Models as Optimizers for Efficient Planning in Offline RL , author=. European Conference on Computer Vision , pages=. 2025 , organization=

  30. [30]

    Lectures on Runtime Verification: Introductory and Advanced Topics , pages=

    Specification-based monitoring of cyber-physical systems: a survey on theory, tools and applications , author=. Lectures on Runtime Verification: Introductory and Advanced Topics , pages=. 2018 , publisher=

  31. [31]

    NASA Formal Methods Symposium , pages=

    Safe Planning Through Incremental Decomposition of Signal Temporal Logic Specifications , author=. NASA Formal Methods Symposium , pages=. 2024 , organization=

  32. [32]

    D4RL: Datasets for Deep Data-Driven Reinforcement Learning

    D4rl: Datasets for deep data-driven reinforcement learning , author=. arXiv preprint arXiv:2004.07219 , year=

  33. [33]

    2024 IEEE/RSJ International Conference on Intelligent Robots and Systems , pages=

    Cobl-diffusion: Diffusion-based conditional robot planning in dynamic environments using control barrier and lyapunov functions , author=. 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems , pages=. 2024 , organization=

  34. [34]

    Advances in Neural Information Processing Systems , volume=

    Constrained synthesis with projected diffusion models , author=. Advances in Neural Information Processing Systems , volume=

  35. [35]

    NASA Formal Methods Symposium , pages=

    Rewrite-based decomposition of signal temporal logic specifications , author=. NASA Formal Methods Symposium , pages=. 2023 , organization=

  36. [36]

    2023 62nd IEEE Conference on Decision and Control , pages=

    Model predictive control for signal temporal logic specifications with time interval decomposition , author=. 2023 62nd IEEE Conference on Decision and Control , pages=. 2023 , organization=

  37. [37]

    International conference on machine learning , pages=

    Deep unsupervised learning using nonequilibrium thermodynamics , author=. International conference on machine learning , pages=. 2015 , organization=

  38. [38]

    Advances in neural information processing systems , volume=

    Diffusion models beat gans on image synthesis , author=. Advances in neural information processing systems , volume=

  39. [39]

    Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review

    Reinforcement learning and control as probabilistic inference: Tutorial and review , author=. arXiv preprint arXiv:1805.00909 , year=

  40. [40]

    Proceedings of the 26th annual international conference on machine learning , pages=

    Robot trajectory optimization using approximate inference , author=. Proceedings of the 26th annual international conference on machine learning , pages=

  41. [41]

    International workshop on artificial intelligence and statistics , pages=

    Planning by probabilistic inference , author=. International workshop on artificial intelligence and statistics , pages=. 2003 , organization=

  42. [42]

    International Symposium on Formal Techniques in Real-Time and Fault-Tolerant Systems , pages=

    Monitoring temporal properties of continuous signals , author=. International Symposium on Formal Techniques in Real-Time and Fault-Tolerant Systems , pages=. 2004 , organization=

  43. [43]

    53rd Annual Allerton Conference on Communication, Control, and Computing , pages=

    Robust temporal logic model predictive control , author=. 53rd Annual Allerton Conference on Communication, Control, and Computing , pages=. 2015 , organization=

  44. [44]

    International Conference on Machine Learning , pages=

    Planning with Diffusion for Flexible Behavior Synthesis , author=. International Conference on Machine Learning , pages=. 2022 , organization=

  45. [45]

    Proceedings of the 18th International Conference on Hybrid Systems: Computation and Control , pages=

    Enforcing temporal logic specifications via reinforcement learning , author=. Proceedings of the 18th International Conference on Hybrid Systems: Computation and Control , pages=

  46. [46]

    Proceedings of the 4th ACM SIGBED International Workshop on Design, Modeling, and Evaluation of Cyber-Physical Systems , pages=

    Model predictive control from signal temporal logic specifications: A case study , author=. Proceedings of the 4th ACM SIGBED International Workshop on Design, Modeling, and Evaluation of Cyber-Physical Systems , pages=

  47. [47]

    IEEE Robotics and Automation Letters , volume=

    Multi-agent motion planning from signal temporal logic specifications , author=. IEEE Robotics and Automation Letters , volume=. 2022 , publisher=

  48. [48]

    IEEE Control Systems Letters , volume=

    Mixed-integer programming for signal temporal logic with fewer binary variables , author=. IEEE Control Systems Letters , volume=. 2022 , publisher=

  49. [49]

    IEEE Robotics and Automation Letters , volume=

    Funnel-based reward shaping for signal temporal logic tasks in reinforcement learning , author=. IEEE Robotics and Automation Letters , volume=. 2023 , publisher=

  50. [50]

    IEEE Control Systems Letters , year=

    Tractable Reinforcement Learning for Signal Temporal Logic Tasks With Counterfactual Experience Replay , author=. IEEE Control Systems Letters , year=

  51. [51]

    IEEE Robotics and Automation Letters , year=

    Signal temporal logic neural predictive control , author=. IEEE Robotics and Automation Letters , year=

  52. [52]

    arXiv preprint arXiv:2408.01923 , year=

    Scalable Signal Temporal Logic Guided Reinforcement Learning via Value Function Space Optimization , author=. arXiv preprint arXiv:2408.01923 , year=

  53. [53]

    International Conference on Software Engineering and Formal Methods , pages=

    Training agents to satisfy timed and untimed signal temporal logic specifications with reinforcement learning , author=. International Conference on Software Engineering and Formal Methods , pages=. 2022 , organization=

  54. [54]

    IEEE International Conference on Robotics and Automation , pages=

    Synthesis of temporally-robust policies for signal temporal logic tasks using reinforcement learning , author=. IEEE International Conference on Robotics and Automation , pages=. 2024 , organization=

  55. [55]

    60th IEEE Conference on Decision and Control , pages=

    Model-free reinforcement learning for optimal control of Markov decision processes under signal temporal logic specifications , author=. 60th IEEE Conference on Decision and Control , pages=. 2021 , organization=

  56. [56]

    IEEE International Conference on Robotics and Automation , pages=

    Guided conditional diffusion for controllable traffic simulation , author=. IEEE International Conference on Robotics and Automation , pages=. 2023 , organization=

  57. [57]

    IEEE Control Systems Letters , volume=

    A smooth robustness measure of signal temporal logic for symbolic control , author=. IEEE Control Systems Letters , volume=. 2020 , publisher=

  58. [58]

    The Eleventh International Conference on Learning Representations , year=

    Is Conditional Generative Modeling all you need for Decision Making? , author=. The Eleventh International Conference on Learning Representations , year=

  59. [59]

    Deep reinforcement learning under signal temporal logic constraints using Lagrangian relaxation. IEEE Access, 2022.

  60. [60]

    Diffusion policy: Visuomotor policy learning via action diffusion. The International Journal of Robotics Research, 2023.

  61. [61]

    Learning to poke by poking: experiential learning of intuitive physics. Proceedings of the 30th International Conference on Neural Information Processing Systems.

  62. [62]

    Backpropagation through signal temporal logic specifications: Infusing logical structure into gradient-based methods. The International Journal of Robotics Research, 2023.

  63. [63]

    Decomposition-Based MPC for Uncertain Systems With Nested Signal Temporal Logic Specifications. IEEE Control Systems Letters.

  64. [64]

    Structured reward functions using STL. Proceedings of the 22nd ACM International Conference on Hybrid Systems: Computation and Control.

  65. [65]

    Signal temporal logic task decomposition via convex optimization. IEEE Control Systems Letters, 2021.

  66. [66]

    Sampling-based Approach to Robust STL Synthesis for Complex Systems under Uncertainty. Proceedings of the 26th ACM International Conference on Hybrid Systems: Computation and Control.

  67. [67]

    Q-learning for robust satisfaction of signal temporal logic specifications. IEEE 55th Conference on Decision and Control, 2016.

  68. [68]

    Stochastic robustness interval for motion planning with signal temporal logic. IEEE International Conference on Robotics and Automation, 2023.

  69. [69]

    Robust counterexample-guided optimization for planning from differentiable temporal logic. IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022.

  70. [70]

    Adaptive planning with generative models under uncertainty. IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024.

  71. [71]

    Denoising diffusion implicit models. arXiv preprint arXiv:2010.02502.

  72. [72]

    Guanghe Li, Yixiang Shan, Zhengbang Zhu, Ting Long, and Weinan Zhang. 2024.

  73. [73]

    Constrained Diffusers for Safe Planning and Control. arXiv preprint arXiv:2506.12544.

  74. [74]

    Optimization of conditional value-at-risk. Journal of Risk.

  75. [75]

    Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 1967.

  76. [76]

    Generative Trajectory Stitching through Diffusion Composition. The Thirty-ninth Annual Conference on Neural Information Processing Systems.

  77. [77]

    Diverse controllable diffusion policy with signal temporal logic. IEEE Robotics and Automation Letters, 2024.

  78. [78]

    LTLDoG: Satisfying temporally-extended symbolic constraints for safe diffusion-based planning. IEEE Robotics and Automation Letters.

  79. [79]

    Diffusion meets options: Hierarchical generative skill composition for temporally-extended tasks. 2025 IEEE International Conference on Robotics and Automation.

  80. [80]

    Tackling the generative learning trilemma with denoising diffusion GANs. arXiv preprint arXiv:2112.07804, 2021.

Showing first 80 references.