pith. machine review for the scientific record.

arxiv: 2605.12651 · v2 · submitted 2026-05-12 · 💻 cs.LG

Runtime Monitoring of Perception-Based Autonomous Systems via Embedding Temporal Logic

Pith reviewed 2026-05-15 05:00 UTC · model grok-4.3

classification 💻 cs.LG
keywords runtime monitoring · temporal logic · embedding spaces · perception-based systems · autonomous systems · conformal calibration · manipulation environments

The pith

Embedding Temporal Logic monitors autonomous systems by defining predicates on distances between observed and reference embeddings.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes Embedding Temporal Logic (ETL) to perform runtime monitoring directly in learned embedding spaces rather than first mapping sensor data to discrete propositions over low-dimensional states. Traditional approaches need extra learned modules that tend to be computationally expensive, brittle, and semantically misaligned. ETL instead treats predicates as distance comparisons between current embeddings and target embeddings taken from reference observations. These distance-based predicates combine with temporal operators to describe sequences of perceptual behaviors, such as approaching a visual goal or staying away from certain semantic regions. The paper introduces monitors for bounded embedding traces and a conformal calibration procedure for reliable predicate evaluation; experiments in manipulation environments show strong agreement with ground-truth semantics.

Core claim

Embedding Temporal Logic (ETL) is a temporal logic that performs monitoring directly in learned embedding spaces. It defines predicates through distances between observed embeddings and target embeddings derived from reference observations. This formulation allows specifications to capture high-level perceptual concepts, such as similarity to visual goals or avoidance of semantic regions, that are difficult or impossible to express using traditional predicates. By composing these predicates with temporal operators, ETL naturally expresses temporally extended and sequential perceptual behaviors. Monitors evaluate specifications over bounded embedding traces, with a conformal calibration step that provides reliable and safety-oriented predicate evaluation.

What carries the argument

Embedding Temporal Logic (ETL), which defines predicates via distance comparisons between observed embeddings and reference-derived target embeddings to represent perceptual concepts.
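
One way to write the distance-based predicate and its bounded temporal compositions is sketched below. The notation is an assumption made for illustration and is not taken from the paper: $z_{0:T}$ is a bounded embedding trace, $z^\star$ a target embedding from a reference observation, $d$ a distance, and $\epsilon$ a threshold. The temporal clauses follow the usual bounded-interval semantics of signal temporal logic, with windows clipped to the trace; the paper's exact syntax may differ.

```latex
% Assumed notation (illustrative, not the paper's): z_{0:T}, z^\star, d, \epsilon.
\begin{aligned}
(z_{0:T}, t) \models \pi_{z^\star,\epsilon}
  &\iff d(z_t, z^\star) \le \epsilon \\
(z_{0:T}, t) \models \neg\varphi
  &\iff (z_{0:T}, t) \not\models \varphi \\
(z_{0:T}, t) \models \lozenge_{[a,b]}\,\varphi
  &\iff \exists\, t' \in [t+a,\, t+b] \cap [0,T] :\ (z_{0:T}, t') \models \varphi \\
(z_{0:T}, t) \models \square_{[a,b]}\,\varphi
  &\iff \forall\, t' \in [t+a,\, t+b] \cap [0,T] :\ (z_{0:T}, t') \models \varphi
\end{aligned}
```

Under this reading, "similarity to a visual goal" is $\pi_{z^\star,\epsilon}$ and "avoidance of a semantic region" is $\neg\pi_{z^\star,\epsilon}$ for a reference embedding of that region.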

If this is right

  • Specifications can now express temporally extended perceptual behaviors without requiring separate state-abstraction modules.
  • Conformal calibration supplies safety-oriented reliability guarantees for predicate satisfaction.
  • Evaluations demonstrate accurate monitoring of both atomic perceptual predicates and their temporal compositions in manipulation environments.
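
The points above can be illustrated with a minimal monitor sketch. It assumes Euclidean distance, a fixed threshold per predicate, and temporal windows clipped at the end of the bounded trace; the function names (`near`, `eventually`, `always`) and the toy 2-D vectors standing in for learned embeddings are hypothetical and not taken from the paper.

```python
import math

def near(trace, target, eps):
    """Atomic predicate per step: True where the Euclidean distance to target is <= eps."""
    return [math.dist(z, target) <= eps for z in trace]

def negate(sat):
    """Pointwise negation of a Boolean timeline."""
    return [not s for s in sat]

def conj(a, b):
    """Pointwise conjunction of two Boolean timelines."""
    return [x and y for x, y in zip(a, b)]

def eventually(sat, horizon):
    """At step t: True if sat holds somewhere in [t, t + horizon], clipped to the trace."""
    return [any(sat[t:t + horizon + 1]) for t in range(len(sat))]

def always(sat, horizon):
    """At step t: True if sat holds everywhere in [t, t + horizon], clipped to the trace."""
    return [all(sat[t:t + horizon + 1]) for t in range(len(sat))]

# Toy 2-D vectors standing in for learned embeddings (assumption for illustration).
trace = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.0, 0.5), (1.0, 1.0)]
z_grasp, z_place, z_avoid = (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)
horizon = len(trace) - 1

at_grasp = near(trace, z_grasp, eps=0.1)
at_place = near(trace, z_place, eps=0.1)
clear = negate(near(trace, z_avoid, eps=0.3))

# "Eventually reach the grasp embedding and, from there, eventually the place
# embedding, while always staying clear of the avoid embedding."
grasp_then_place = eventually(conj(at_grasp, eventually(at_place, horizon)), horizon)
spec_holds = grasp_then_place[0] and always(clear, horizon)[0]
print(at_grasp, spec_holds)
```

No state-abstraction module appears anywhere in the sketch: the only inputs are the embedding trace, the reference embeddings, and the thresholds.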

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Embedding-space monitoring could eliminate the need for separate perception-to-state translation layers in many autonomous pipelines.
  • If embedding distances prove consistent across environments, the same ETL formulas might transfer to new tasks without retraining the logic itself.
  • The distance-based predicate style might support natural-language-style task descriptions such as 'stay near objects like the training set' once reference embeddings are collected.

Load-bearing premise

Distances computed in the learned embedding space must align reliably with the intended perceptual meanings so that the predicates remain semantically valid.

What would settle it

An experiment showing an embedding distance that reports high similarity for two observations a human would judge as perceptually dissimilar, causing an ETL monitor to report false satisfaction of a specification.

Figures

Figures reproduced from arXiv: 2605.12651 by Abigail Hammer, Ashish Kapoor, Eunsuk Kang, Karen Leung, Parv Kapoor.

Figure 1: Overview of embedding-based runtime monitoring.
Figure 2: Qualitative and quantitative ETL results.
Figure 3: Dual-predicate Boolean timelines for mw-pick-place-wall. Top panels show distances to grasp (z_A) and place (z_B) embeddings with thresholds ϵ* and ϵ_CP. Lower panels show ETL and ground-truth predicate activations. corresponding to a pick-and-place interaction. For tasks where proprioception is not sufficient to identify phase of tasks, we manually annotate the videos and generate the ground truth predicat…

Original abstract

Runtime monitoring of autonomous systems traditionally relies on mapping continuous sensor observations to discrete logical propositions defined over low-dimensional state variables. This abstraction breaks down in perception-driven settings, where such mappings require additional learned modules that are often computationally expensive, brittle, and semantically misaligned. In this work, we propose Embedding Temporal Logic (ETL), a temporal logic that performs monitoring directly in learned embedding spaces. ETL defines predicates through distances between observed embeddings and target embeddings derived from reference observations. This formulation allows specifications to capture high-level perceptual concepts, such as similarity to visual goals or avoidance of semantic regions, that are difficult or impossible to express using traditional predicates. By composing these predicates with temporal operators, ETL naturally expresses temporally extended and sequential perceptual behaviors. We introduce ETL monitors for evaluating specifications over bounded embedding traces, along with a conformal calibration procedure that provides reliable and safety-oriented predicate evaluation. We evaluate our approach across multiple manipulation environments to show that ETL achieves strong empirical agreement with ground-truth semantics, including accurate monitoring of temporally composed behaviors.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces Embedding Temporal Logic (ETL) for runtime monitoring of perception-based autonomous systems. ETL defines predicates directly in learned embedding spaces via distances between observed embeddings and target embeddings from reference observations, enabling high-level perceptual concepts such as visual similarity or semantic avoidance. These predicates are composed with standard temporal operators to express sequential behaviors. The work provides monitors for bounded embedding traces and a conformal calibration procedure to ensure reliable predicate evaluation with statistical guarantees. Empirical results on manipulation tasks show strong agreement with ground-truth semantics for both atomic and temporally composed specifications.

Significance. If the embedding-distance predicates prove semantically reliable and the conformal guarantees hold under the stated assumptions, ETL would offer a meaningful advance for runtime verification in perception-driven autonomy. It sidesteps brittle low-level state abstractions by operating natively in embedding spaces, which is practically relevant for vision-based systems. The combination of temporal logic with conformal prediction supplies a concrete path toward safety-oriented monitoring with finite-sample coverage, and the reported empirical agreement on manipulation tasks suggests immediate applicability if the gaps in metric detail and assumption analysis are addressed.

major comments (3)
  1. [Conformal calibration] Conformal calibration section: the procedure claims finite-sample coverage for predicate reliability, yet the manuscript does not specify the nonconformity score (e.g., whether it is raw distance or a normalized variant) nor the calibration-set construction for embedding traces. This detail is load-bearing for the safety claims, as coverage can fail if the score does not satisfy exchangeability under perceptual distribution shift.
  2. [Predicate definition] Predicate definition: the central claim that distances to reference embeddings capture perceptual concepts (e.g., 'similarity to visual goals') is presented as enabling high-level specifications, but no analysis is given of when the embedding metric aligns with human-interpretable semantics versus when it collapses under minor visual perturbations. This assumption directly affects whether the logic remains meaningful beyond the reported tasks.
  3. [Empirical evaluation] Empirical evaluation: while agreement with ground-truth semantics is reported, the section provides no quantitative breakdown (e.g., precision/recall per temporal operator or false-positive rates on composed specifications), nor error analysis for cases where embedding distances deviate from intended predicates. These metrics are necessary to substantiate the claim of accurate monitoring of temporally extended behaviors.
minor comments (2)
  1. [Abstract] The abstract refers to 'multiple manipulation environments' without naming them or providing links to datasets; adding this information would improve reproducibility.
  2. [Notation] Notation for observed versus target embeddings is introduced but used inconsistently in the monitor pseudocode; a single clarifying table or definition box would help.
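
The quantitative breakdown requested in major comment 3 could be computed per predicate or per temporal operator along the following lines. This is a minimal sketch, assuming the monitor output and the ground truth are available as aligned Boolean timelines; the function name `monitor_metrics` and the example timelines are hypothetical, and no numbers here come from the paper.

```python
def monitor_metrics(predicted, truth):
    """Precision, recall and false-positive rate for aligned Boolean timelines.

    A ratio is reported as None when its denominator is zero.
    """
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if not p and t)
    tn = sum(1 for p, t in pairs if not p and not t)

    def ratio(num, den):
        return num / den if den else None

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "false_positive_rate": ratio(fp, fp + tn),
    }

# Made-up timelines: the monitor fires one step early and releases one step early.
etl_timeline = [False, True, True, True, False, False]
ground_truth = [False, False, True, True, True, False]
print(monitor_metrics(etl_timeline, ground_truth))
```

Running the same function on the timeline of each composed formula, rather than only on atomic predicates, would give the per-operator view the referee asks for.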

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below with clarifications drawn from the manuscript and indicate planned revisions to improve clarity and rigor.

Point-by-point responses
  1. Referee: [Conformal calibration] Conformal calibration section: the procedure claims finite-sample coverage for predicate reliability, yet the manuscript does not specify the nonconformity score (e.g., whether it is raw distance or a normalized variant) nor the calibration-set construction for embedding traces. This detail is load-bearing for the safety claims, as coverage can fail if the score does not satisfy exchangeability under perceptual distribution shift.

    Authors: We thank the referee for this important observation. The nonconformity score is the raw Euclidean distance between the observed embedding and the target reference embedding; no normalization is applied. The calibration set is formed from a held-out collection of reference observations drawn from the same perceptual distribution used for the test traces. We will revise the conformal calibration section to state these choices explicitly and add a paragraph discussing the exchangeability assumption together with its sensitivity to distribution shift. revision: yes

  2. Referee: [Predicate definition] Predicate definition: the central claim that distances to reference embeddings capture perceptual concepts (e.g., 'similarity to visual goals') is presented as enabling high-level specifications, but no analysis is given of when the embedding metric aligns with human-interpretable semantics versus when it collapses under minor visual perturbations. This assumption directly affects whether the logic remains meaningful beyond the reported tasks.

    Authors: The predicates rely on distances in the learned embedding space precisely because the embedding model is trained to encode perceptual similarity. While the current manuscript does not contain a dedicated robustness analysis, the reported experiments on manipulation tasks demonstrate close agreement with ground-truth semantics. In revision we will insert a short discussion of the conditions under which the embedding metric is expected to align with intended concepts and the risk of collapse under perturbations, together with guidance on embedding-model selection. revision: partial

  3. Referee: [Empirical evaluation] Empirical evaluation: while agreement with ground-truth semantics is reported, the section provides no quantitative breakdown (e.g., precision/recall per temporal operator or false-positive rates on composed specifications), nor error analysis for cases where embedding distances deviate from intended predicates. These metrics are necessary to substantiate the claim of accurate monitoring of temporally extended behaviors.

    Authors: We agree that finer-grained quantitative metrics would strengthen the evaluation. We will expand the empirical section to report precision and recall separately for atomic predicates and for each temporal operator, include false-positive rates on composed specifications, and add an error analysis highlighting cases where embedding distances deviate from the intended predicate semantics. revision: yes
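
Response 1 above names the raw Euclidean distance as the nonconformity score and a held-out set of reference observations as the calibration set. A minimal split-conformal sketch under those choices follows. It assumes the calibration scores are distances from held-out frames where the predicate truly holds, which is one plausible reading and not confirmed by the source; the function name `conformal_threshold` and the numbers are hypothetical.

```python
import math

def conformal_threshold(cal_scores, alpha):
    """Split-conformal threshold: the k-th smallest calibration score,
    with k = ceil((n + 1) * (1 - alpha)).

    If the calibration scores and a new score are exchangeable, the new score
    is <= this threshold with probability at least 1 - alpha.
    Returns infinity when the calibration set is too small for the chosen alpha.
    """
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(cal_scores)[k - 1]

# Made-up distances to a target embedding on held-out frames where the
# predicate is labeled true (assumption for illustration).
cal_distances = [0.12, 0.08, 0.15, 0.10, 0.09, 0.20, 0.11, 0.13, 0.07, 0.14]
eps_cp = conformal_threshold(cal_distances, alpha=0.2)

def calibrated_predicate(distance):
    """Predicate evaluated with the calibrated threshold instead of a hand-picked one."""
    return distance <= eps_cp

print(eps_cp, calibrated_predicate(0.12), calibrated_predicate(0.18))
```

The coverage statement depends on exchangeability between calibration and test frames, which is the assumption the referee flags as vulnerable to perceptual distribution shift.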

Circularity Check

0 steps flagged

No significant circularity in ETL derivation chain

The paper defines Embedding Temporal Logic (ETL) by introducing predicates as distances between observed embeddings and target embeddings from reference observations, then composes them with standard temporal operators and applies conformal calibration for predicate reliability. This construction draws on established embedding learning and conformal prediction frameworks with independent grounding outside the paper; the central claims do not reduce by the paper's own equations to self-referential definitions, fitted parameters renamed as predictions, or load-bearing self-citations. Empirical agreement with ground-truth semantics on manipulation tasks serves as external validation rather than internal forcing. The derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that learned embeddings provide a semantically meaningful metric space for perceptual predicates and that conformal calibration yields reliable safety-oriented evaluations.

axioms (1)
  • domain assumption: Distances in learned embedding spaces align with perceptual semantic similarity.
    This underpins the predicate definitions for high-level concepts like visual goal similarity.

pith-pipeline@v0.9.0 · 5479 in / 1102 out tokens · 41898 ms · 2026-05-15T05:00:04.080641+00:00 · methodology
