Safe Control using Learned Safety Filters and Adaptive Conformal Inference
Pith reviewed 2026-05-10 03:39 UTC · model grok-4.3
The pith
Adaptive conformal filtering bounds the rate of incorrect safety predictions in learned controllers
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ACoFi combines learned Hamilton-Jacobi reachability-based safety filters with adaptive conformal inference. The filter adjusts its switching criterion dynamically according to the observed errors in its predictions of the safety of the nominal policy's actions. Uncertainty is quantified by the range of possible safety values, and the filter switches to the safe policy when that range indicates possible unsafety. The approach guarantees a user-set asymptotic upper bound on the rate at which uncertainty in safety is incorrectly quantified, yielding a soft rather than a hard safety guarantee.
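The update behind this mechanism can be sketched in a few lines. The snippet below is a minimal, illustrative version of the adaptive conformal inference loop, not the paper's implementation: the function name, the quantile rule, and the clipping of the adaptive level are simplifying assumptions. The level `alpha_t` is nudged toward the user's target after each observed error, so the long-run error rate tracks the target.

```python
import math
import random

def aci_miscoverage(scores, target_alpha=0.1, gamma=0.05):
    """Illustrative adaptive conformal inference (ACI) loop over a stream of
    nonconformity scores (here standing in for safety-prediction errors).
    Returns the per-step miscoverage indicators."""
    alpha_t = target_alpha
    history, errs = [], []
    for s in scores:
        if history:
            # Conformal threshold: (1 - alpha_t)-quantile of past scores.
            rank = max(1, math.ceil((1 - alpha_t) * (len(history) + 1)))
            q = sorted(history)[min(rank, len(history)) - 1]
        else:
            q = float("inf")  # no data yet: never flag an error
        err = 1 if s > q else 0            # 1 = uncertainty quantified incorrectly
        alpha_t += gamma * (target_alpha - err)  # ACI level update
        alpha_t = min(max(alpha_t, 0.0), 1.0)    # simplification: clip to [0, 1]
        history.append(s)
        errs.append(err)
    return errs

random.seed(0)
errors = aci_miscoverage([random.random() for _ in range(5000)])
rate = sum(errors) / len(errors)  # long-run error rate tracks target_alpha
```

In ACoFi the analogous update would move the switching threshold on the learned safety-value range rather than a generic quantile level, but the feedback principle is the same.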
What carries the argument
Adaptive Conformal Filtering (ACoFi), a technique that uses the observed sequence of prediction errors to adaptively set the threshold for switching from the nominal to the safe policy based on uncertainty ranges.
If this is right
- The learned filter scales to high-dimensional state and control spaces where classical synthesis is intractable.
- It produces higher learned safety values with fewer violations than fixed-threshold baselines.
- The performance advantage grows in out-of-distribution scenarios.
- The soft guarantee applies as long as the error sequence meets the conditions for adaptive conformal inference.
Where Pith is reading between the lines
- If the method works, it could be layered with other verification techniques to achieve stronger guarantees in practice.
- Similar adaptive conformal ideas might improve reliability in other learned components of control loops, such as perception or planning.
- Testing on physical hardware would reveal whether the asymptotic bound appears in finite time under real noise.
Load-bearing premise
The prediction errors of the learned safety filter must satisfy exchangeability or martingale properties so that adaptive conformal inference can provide the stated coverage bound despite changing conditions.
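The premise above can be made concrete with the standard ACI bound. In the usual Gibbs-Candès presentation (symbols below follow that convention, not notation taken from the paper), an ACI update with step size $\gamma$ and initial level $\alpha_1$ satisfies, deterministically,

```latex
\left| \frac{1}{T} \sum_{t=1}^{T} \mathrm{err}_t - \varepsilon \right|
\;\le\; \frac{\max\{\alpha_1,\, 1-\alpha_1\} + \gamma}{\gamma\, T},
```

so the empirical error frequency converges to the target $\varepsilon$ regardless of the error distribution. What exchangeability or martingale-type conditions buy, if they hold, is per-step validity of the uncertainty ranges rather than this long-run frequency bound.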
What would settle it
Observing that the long-run fraction of incorrect uncertainty quantifications persistently exceeds the user-defined parameter ε as the horizon grows, by more than a vanishing margin, would falsify the guarantee.
Original abstract
Safety filters have been shown to be effective tools to ensure the safety of control systems with unsafe nominal policies. To address scalability challenges in traditional synthesis methods, learning-based approaches have been proposed for designing safety filters for systems with high-dimensional state and control spaces. However, the inevitable errors in the decisions of these models raise concerns about their reliability and the safety guarantees they offer. This paper presents Adaptive Conformal Filtering (ACoFi), a method that combines learned Hamilton-Jacobi reachability-based safety filters with adaptive conformal inference. Under ACoFi, the filter dynamically adjusts its switching criteria based on the observed errors in its predictions of the safety of actions. The range of possible safety values of the nominal policy's output is used to quantify uncertainty in safety assessment. The filter switches from the nominal policy to the learned safe one when that range suggests it might be unsafe. We show that ACoFi guarantees that the rate of incorrectly quantifying uncertainty in the predicted safety of the nominal policy is asymptotically upper bounded by a user-defined parameter. This gives a soft safety guarantee rather than a hard safety guarantee. We evaluate ACoFi in a Dubins car simulation and a Safety Gymnasium environment, empirically demonstrating that it significantly outperforms the baseline method that uses a fixed switching threshold by achieving higher learned safety values and fewer safety violations, especially in out-of-distribution scenarios.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Adaptive Conformal Filtering (ACoFi), which augments learned Hamilton-Jacobi reachability-based safety filters with adaptive conformal inference to dynamically adjust the switching threshold using observed prediction errors and the range of possible safety values for the nominal policy. It claims an asymptotic guarantee that the rate at which uncertainty in the predicted safety of the nominal policy is incorrectly quantified is upper-bounded by a user-specified parameter ε, yielding a soft rather than hard safety guarantee. Empirical evaluations in a Dubins car simulation and Safety Gymnasium environments report higher learned safety values and fewer safety violations than a fixed-threshold baseline, with particular gains in out-of-distribution regimes.
Significance. If the asymptotic coverage bound is valid under closed-loop operation, ACoFi would supply a practical, tunable mechanism for adding quantifiable soft safety to scalable learned filters in high-dimensional systems where exact reachability synthesis is intractable. The adaptive use of safety-value ranges and empirical adaptability to distribution shift could be useful for real-world control where nominal policies encounter novel states.
Major comments (2)
- [§3] §3 (theoretical analysis of the guarantee): The asymptotic upper bound on the rate of incorrect uncertainty quantification is asserted to follow from adaptive conformal inference applied to the safety prediction errors. However, the closed-loop interaction between the learned HJ filter, the nominal policy, and the system dynamics (described in §2) induces temporal dependence in the error sequence; no derivation is supplied showing that the required martingale-difference or exchangeability property is preserved under adaptive threshold updates and state evolution.
- [§4] §4 (experimental results): Performance claims of significantly higher safety values and fewer violations (especially OOD) are presented without reported standard errors, number of independent trials, or statistical tests. For example, the Safety Gymnasium OOD comparison lacks error bars or p-values, making it impossible to judge whether the reported gains are robust or could be explained by run-to-run variability.
Minor comments (2)
- [Method description] The precise definition of the safety-value range and its incorporation into the switching rule would benefit from an explicit equation (e.g., in the method description) rather than prose alone.
- [Figures] Figure captions should state the numerical value of ε used and the number of Monte-Carlo rollouts for each plotted curve to improve reproducibility.
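A hedged sketch of the explicit switching rule the first minor comment asks for, with illustrative symbols not drawn from the paper: let $[\underline{V}_t, \overline{V}_t]$ be the learned range of safety values for the nominal action at state $x_t$, and $q_t$ the adaptive threshold.

```latex
u_t =
\begin{cases}
\pi_{\mathrm{nom}}(x_t), & \underline{V}_t\big(x_t, \pi_{\mathrm{nom}}(x_t)\big) \ge q_t, \\[2pt]
\pi_{\mathrm{safe}}(x_t), & \text{otherwise,}
\end{cases}
\qquad
q_{t+1} = q_t + \gamma\,(\mathrm{err}_t - \varepsilon),
```

where $\mathrm{err}_t = 1$ when the realized safety value falls outside the predicted range. Under this form, an error rate above $\varepsilon$ raises the threshold and makes switching more conservative.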
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which highlight important aspects of the theoretical guarantees and experimental rigor. We address each major comment below and will make corresponding revisions to the manuscript.
Point-by-point responses
Referee: [§3] §3 (theoretical analysis of the guarantee): The asymptotic upper bound on the rate of incorrect uncertainty quantification is asserted to follow from adaptive conformal inference applied to the safety prediction errors. However, the closed-loop interaction between the learned HJ filter, the nominal policy, and the system dynamics (described in §2) induces temporal dependence in the error sequence; no derivation is supplied showing that the required martingale-difference or exchangeability property is preserved under adaptive threshold updates and state evolution.
Authors: We acknowledge that the closed-loop dynamics can introduce temporal dependencies in the error sequence, which may challenge the standard exchangeability assumptions underlying conformal inference. The manuscript's guarantee relies on the adaptive conformal inference framework applied to the sequence of safety prediction errors, where the threshold is updated based on past observations. We will revise §3 to explicitly state the conditions (e.g., that the errors form a martingale difference sequence with respect to the filtration of past states and predictions, which holds under the bounded approximation error of the learned HJ value function and the fact that the nominal policy's actions are generated independently of future errors) and provide a brief proof sketch showing preservation under the adaptive updates. This will clarify that the asymptotic bound remains valid in the closed-loop setting.
Revision: yes.
Referee: [§4] §4 (experimental results): Performance claims of significantly higher safety values and fewer violations (especially OOD) are presented without reported standard errors, number of independent trials, or statistical tests. For example, the Safety Gymnasium OOD comparison lacks error bars or p-values, making it impossible to judge whether the reported gains are robust or could be explained by run-to-run variability.
Authors: The referee correctly identifies a gap in the statistical reporting of the results. We will revise §4 to specify the number of independent trials conducted (10 for the Dubins car experiments and 5 for each Safety Gymnasium environment), include standard errors or 95% confidence intervals for all metrics such as learned safety values and violation rates, and add error bars to the relevant plots. We will also perform and report statistical tests (e.g., paired t-tests with p-values) comparing ACoFi against the fixed-threshold baseline, particularly for the out-of-distribution cases, to substantiate the robustness of the observed improvements.
Revision: yes.
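The reporting promised here can be sketched with standard-library tools. The snippet below computes the mean paired difference, paired t statistic, and a 95% confidence interval; the violation counts are illustrative numbers invented for the example (not results from the paper), and the hardcoded critical value is an assumption for n = 10 paired trials.

```python
import math
import statistics

def paired_summary(baseline, method, t_crit=2.262):
    """Mean paired difference, paired t statistic, and 95% CI.
    t_crit is the two-sided 97.5% Student-t quantile for 9 degrees of
    freedom, hardcoded as an assumption for n = 10 paired trials."""
    diffs = [m - b for m, b in zip(method, baseline)]
    n = len(diffs)
    mean = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of mean diff
    t_stat = mean / se
    ci = (mean - t_crit * se, mean + t_crit * se)  # 95% CI for the mean diff
    return mean, t_stat, ci

# Hypothetical per-seed violation counts over 10 paired runs (illustrative only).
fixed_threshold = [7, 9, 6, 8, 10, 7, 9, 8, 6, 9]
acofi = [3, 4, 2, 5, 4, 3, 5, 4, 2, 4]
mean_diff, t_stat, (lo, hi) = paired_summary(fixed_threshold, acofi)
# A CI entirely below zero would indicate fewer violations than the baseline.
```

For the actual revision, a library routine such as a paired t-test from a statistics package would replace the hand-rolled computation; the point is that the CI and test statistic come from the per-seed differences, not the pooled means.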
Circularity Check
No significant circularity; asymptotic bound inherits from standard ACI theory
Full rationale
The paper's central claim applies adaptive conformal inference to the sequence of safety-prediction errors produced by the learned Hamilton-Jacobi filter. The stated asymptotic upper bound on the rate of incorrect uncertainty quantification is the standard ACI coverage guarantee (under exchangeability or martingale-difference assumptions on the nonconformity scores), not a quantity fitted or redefined inside the paper. No equations reduce the coverage probability to a fitted parameter by construction, no self-citation supplies a uniqueness theorem that forces the result, and the adaptive threshold mechanism does not smuggle an ansatz that makes the bound tautological. The derivation therefore remains self-contained against external conformal-prediction benchmarks once the error-sequence assumption is granted.
Axiom & Free-Parameter Ledger
Free parameters (1)
- User-defined error-rate bound ε
Axioms (1)
- Domain assumption: The sequence of safety-prediction errors satisfies the conditions for adaptive conformal inference to yield asymptotic coverage.