Safe Control using Learned Safety Filters and Adaptive Conformal Inference
Pith reviewed 2026-05-10 03:39 UTC · model grok-4.3
The pith
Adaptive conformal filtering bounds the rate of incorrect safety predictions in learned controllers
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ACoFi combines learned Hamilton-Jacobi reachability-based safety filters with adaptive conformal inference. The filter adjusts its switching criterion dynamically according to the observed errors in its predictions of the safety of the nominal policy's actions. Uncertainty is quantified by the range of possible safety values, and the filter switches to the safe policy when that range indicates possible unsafety. The approach guarantees a user-set asymptotic upper bound on the rate at which uncertainty in safety is incorrectly quantified, yielding a soft rather than a hard safety guarantee.
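The update behind this mechanism can be sketched in a few lines. The snippet below is a minimal, illustrative version of the adaptive conformal inference loop, not the paper's implementation: the function name, the quantile rule, and the clipping of the adaptive level are simplifying assumptions. The level `alpha_t` is nudged toward the user's target after each observed error, so the long-run error rate tracks the target.

```python
import math
import random

def aci_miscoverage(scores, target_alpha=0.1, gamma=0.05):
    """Illustrative adaptive conformal inference (ACI) loop over a stream of
    nonconformity scores (here standing in for safety-prediction errors).
    Returns the per-step miscoverage indicators."""
    alpha_t = target_alpha
    history, errs = [], []
    for s in scores:
        if history:
            # Conformal threshold: (1 - alpha_t)-quantile of past scores.
            rank = max(1, math.ceil((1 - alpha_t) * (len(history) + 1)))
            q = sorted(history)[min(rank, len(history)) - 1]
        else:
            q = float("inf")  # no data yet: never flag an error
        err = 1 if s > q else 0            # 1 = uncertainty quantified incorrectly
        alpha_t += gamma * (target_alpha - err)  # ACI level update
        alpha_t = min(max(alpha_t, 0.0), 1.0)    # simplification: clip to [0, 1]
        history.append(s)
        errs.append(err)
    return errs

random.seed(0)
errors = aci_miscoverage([random.random() for _ in range(5000)])
rate = sum(errors) / len(errors)  # long-run error rate tracks target_alpha
```

In ACoFi the analogous update would move the switching threshold on the learned safety-value range rather than a generic quantile level, but the feedback principle is the same.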
What carries the argument
Adaptive Conformal Filtering (ACoFi), a technique that uses the observed sequence of prediction errors to adaptively set the threshold for switching from the nominal to the safe policy based on uncertainty ranges.
If this is right
- The learned filter scales to high-dimensional state and control spaces where classical synthesis is intractable.
- It produces higher learned safety values with fewer violations than fixed-threshold baselines.
- The performance advantage grows in out-of-distribution scenarios.
- The soft guarantee applies as long as the error sequence meets the conditions for adaptive conformal inference.
Where Pith is reading between the lines
- If the method works, it could be layered with other verification techniques to achieve stronger guarantees in practice.
- Similar adaptive conformal ideas might improve reliability in other learned components of control loops, such as perception or planning.
- Testing on physical hardware would reveal whether the asymptotic bound appears in finite time under real noise.
Load-bearing premise
The prediction errors of the learned safety filter must satisfy exchangeability or martingale properties so that adaptive conformal inference can provide the stated coverage bound despite changing conditions.
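The premise above can be made concrete with the standard ACI bound. In the usual Gibbs-Candès presentation (symbols below follow that convention, not notation taken from the paper), an ACI update with step size $\gamma$ and initial level $\alpha_1$ satisfies, deterministically,

```latex
\left| \frac{1}{T} \sum_{t=1}^{T} \mathrm{err}_t - \varepsilon \right|
\;\le\; \frac{\max\{\alpha_1,\, 1-\alpha_1\} + \gamma}{\gamma\, T},
```

so the empirical error frequency converges to the target $\varepsilon$ regardless of the error distribution. What exchangeability or martingale-type conditions buy, if they hold, is per-step validity of the uncertainty ranges rather than this long-run frequency bound.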
What would settle it
Observing that the long-run fraction of incorrect uncertainty quantifications persistently exceeds the user-defined parameter ε as the horizon grows, by more than a vanishing margin, would falsify the guarantee.
Original abstract
Safety filters have been shown to be effective tools to ensure the safety of control systems with unsafe nominal policies. To address scalability challenges in traditional synthesis methods, learning-based approaches have been proposed for designing safety filters for systems with high-dimensional state and control spaces. However, the inevitable errors in the decisions of these models raise concerns about their reliability and the safety guarantees they offer. This paper presents Adaptive Conformal Filtering (ACoFi), a method that combines learned Hamilton-Jacobi reachability-based safety filters with adaptive conformal inference. Under ACoFi, the filter dynamically adjusts its switching criteria based on the observed errors in its predictions of the safety of actions. The range of possible safety values of the nominal policy's output is used to quantify uncertainty in safety assessment. The filter switches from the nominal policy to the learned safe one when that range suggests it might be unsafe. We show that ACoFi guarantees that the rate of incorrectly quantifying uncertainty in the predicted safety of the nominal policy is asymptotically upper bounded by a user-defined parameter. This gives a soft safety guarantee rather than a hard safety guarantee. We evaluate ACoFi in a Dubins car simulation and a Safety Gymnasium environment, empirically demonstrating that it significantly outperforms the baseline method that uses a fixed switching threshold by achieving higher learned safety values and fewer safety violations, especially in out-of-distribution scenarios.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Adaptive Conformal Filtering (ACoFi), which augments learned Hamilton-Jacobi reachability-based safety filters with adaptive conformal inference to dynamically adjust the switching threshold using observed prediction errors and the range of possible safety values for the nominal policy. It claims an asymptotic guarantee that the rate at which uncertainty in the predicted safety of the nominal policy is incorrectly quantified is upper-bounded by a user-specified parameter ε, yielding a soft rather than hard safety guarantee. Empirical evaluations in a Dubins car simulation and Safety Gymnasium environments report higher learned safety values and fewer safety violations than a fixed-threshold baseline, with particular gains in out-of-distribution regimes.
Significance. If the asymptotic coverage bound is valid under closed-loop operation, ACoFi would supply a practical, tunable mechanism for adding quantifiable soft safety to scalable learned filters in high-dimensional systems where exact reachability synthesis is intractable. The adaptive use of safety-value ranges and empirical adaptability to distribution shift could be useful for real-world control where nominal policies encounter novel states.
Major comments (2)
- [§3] §3 (theoretical analysis of the guarantee): The asymptotic upper bound on the rate of incorrect uncertainty quantification is asserted to follow from adaptive conformal inference applied to the safety prediction errors. However, the closed-loop interaction between the learned HJ filter, the nominal policy, and the system dynamics (described in §2) induces temporal dependence in the error sequence; no derivation is supplied showing that the required martingale-difference or exchangeability property is preserved under adaptive threshold updates and state evolution.
- [§4] §4 (experimental results): Performance claims of significantly higher safety values and fewer violations (especially OOD) are presented without reported standard errors, number of independent trials, or statistical tests. For example, the Safety Gymnasium OOD comparison lacks error bars or p-values, making it impossible to judge whether the reported gains are robust or could be explained by run-to-run variability.
Minor comments (2)
- [Method description] The precise definition of the safety-value range and its incorporation into the switching rule would benefit from an explicit equation (e.g., in the method description) rather than prose alone.
- [Figures] Figure captions should state the numerical value of ε used and the number of Monte-Carlo rollouts for each plotted curve to improve reproducibility.
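A hedged sketch of the explicit switching rule the first minor comment asks for, with illustrative symbols not drawn from the paper: let $[\underline{V}_t, \overline{V}_t]$ be the learned range of safety values for the nominal action at state $x_t$, and $q_t$ the adaptive threshold.

```latex
u_t =
\begin{cases}
\pi_{\mathrm{nom}}(x_t), & \underline{V}_t\big(x_t, \pi_{\mathrm{nom}}(x_t)\big) \ge q_t, \\[2pt]
\pi_{\mathrm{safe}}(x_t), & \text{otherwise,}
\end{cases}
\qquad
q_{t+1} = q_t + \gamma\,(\mathrm{err}_t - \varepsilon),
```

where $\mathrm{err}_t = 1$ when the realized safety value falls outside the predicted range. Under this form, an error rate above $\varepsilon$ raises the threshold and makes switching more conservative.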
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which highlight important aspects of the theoretical guarantees and experimental rigor. We address each major comment below and will make corresponding revisions to the manuscript.
Point-by-point responses
Referee: [§3] §3 (theoretical analysis of the guarantee): The asymptotic upper bound on the rate of incorrect uncertainty quantification is asserted to follow from adaptive conformal inference applied to the safety prediction errors. However, the closed-loop interaction between the learned HJ filter, the nominal policy, and the system dynamics (described in §2) induces temporal dependence in the error sequence; no derivation is supplied showing that the required martingale-difference or exchangeability property is preserved under adaptive threshold updates and state evolution.
Authors: We acknowledge that the closed-loop dynamics can introduce temporal dependencies in the error sequence, which may challenge the standard exchangeability assumptions underlying conformal inference. The manuscript's guarantee relies on the adaptive conformal inference framework applied to the sequence of safety prediction errors, where the threshold is updated based on past observations. We will revise §3 to explicitly state the conditions (e.g., that the errors form a martingale difference sequence with respect to the filtration of past states and predictions, which holds under the bounded approximation error of the learned HJ value function and the fact that the nominal policy's actions are generated independently of future errors) and provide a brief proof sketch showing preservation under the adaptive updates. This will clarify that the asymptotic bound remains valid in the closed-loop setting.
Revision: yes.
Referee: [§4] §4 (experimental results): Performance claims of significantly higher safety values and fewer violations (especially OOD) are presented without reported standard errors, number of independent trials, or statistical tests. For example, the Safety Gymnasium OOD comparison lacks error bars or p-values, making it impossible to judge whether the reported gains are robust or could be explained by run-to-run variability.
Authors: The referee correctly identifies a gap in the statistical reporting of the results. We will revise §4 to specify the number of independent trials conducted (10 for the Dubins car experiments and 5 for each Safety Gymnasium environment), include standard errors or 95% confidence intervals for all metrics such as learned safety values and violation rates, and add error bars to the relevant plots. We will also perform and report statistical tests (e.g., paired t-tests with p-values) comparing ACoFi against the fixed-threshold baseline, particularly for the out-of-distribution cases, to substantiate the robustness of the observed improvements.
Revision: yes.
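The reporting promised here can be sketched with standard-library tools. The snippet below computes the mean paired difference, paired t statistic, and a 95% confidence interval; the violation counts are illustrative numbers invented for the example (not results from the paper), and the hardcoded critical value is an assumption for n = 10 paired trials.

```python
import math
import statistics

def paired_summary(baseline, method, t_crit=2.262):
    """Mean paired difference, paired t statistic, and 95% CI.
    t_crit is the two-sided 97.5% Student-t quantile for 9 degrees of
    freedom, hardcoded as an assumption for n = 10 paired trials."""
    diffs = [m - b for m, b in zip(method, baseline)]
    n = len(diffs)
    mean = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of mean diff
    t_stat = mean / se
    ci = (mean - t_crit * se, mean + t_crit * se)  # 95% CI for the mean diff
    return mean, t_stat, ci

# Hypothetical per-seed violation counts over 10 paired runs (illustrative only).
fixed_threshold = [7, 9, 6, 8, 10, 7, 9, 8, 6, 9]
acofi = [3, 4, 2, 5, 4, 3, 5, 4, 2, 4]
mean_diff, t_stat, (lo, hi) = paired_summary(fixed_threshold, acofi)
# A CI entirely below zero would indicate fewer violations than the baseline.
```

For the actual revision, a library routine such as a paired t-test from a statistics package would replace the hand-rolled computation; the point is that the CI and test statistic come from the per-seed differences, not the pooled means.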
Circularity Check
No significant circularity; asymptotic bound inherits from standard ACI theory
Full rationale
The paper's central claim applies adaptive conformal inference to the sequence of safety-prediction errors produced by the learned Hamilton-Jacobi filter. The stated asymptotic upper bound on the rate of incorrect uncertainty quantification is the standard ACI coverage guarantee (under exchangeability or martingale-difference assumptions on the nonconformity scores), not a quantity fitted or redefined inside the paper. No equations reduce the coverage probability to a fitted parameter by construction, no self-citation supplies a uniqueness theorem that forces the result, and the adaptive threshold mechanism does not smuggle an ansatz that makes the bound tautological. The derivation therefore remains self-contained against external conformal-prediction benchmarks once the error-sequence assumption is granted.
Axiom & Free-Parameter Ledger
Free parameters (1)
- User-defined error-rate bound ε
Axioms (1)
- Domain assumption: The sequence of safety-prediction errors satisfies the conditions for adaptive conformal inference to yield asymptotic coverage.