Are Safety Guarantees in Neural Networks Safe? How to Compute Trustworthy Robustness Certifications
Pith reviewed 2026-06-26 08:51 UTC · model grok-4.3
The pith
Apothem-optimal robustness certifications for neural networks require only a linear number of verifier calls in the input domain diameter.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce the apothem measure and show how to compute apothem-optimal certifications in a linear number of calls to a NN verifier (oracle) w.r.t. the input domain's diameter. Moreover, we prove that we cannot have a volume-optimal, oracle-based algorithm, even if we discard the oracle costs. Also, we introduce dual certifications -- an interval including all instances of a class -- thus providing apothem-minimum upper bounds to a robustness certification. Further, we present the ParallelepipedoNN system, which we evaluate on the standard MNIST and Fashion MNIST benchmarks.
What carries the argument
The apothem measure, the minimum distance from a certification hyper-rectangle's center to any of its faces, used to select the largest axis-aligned robustness region via repeated oracle queries to a neural network verifier.
If this is right
- Apothem-optimal certifications are computable with a linear number of oracle calls relative to input domain diameter.
- No volume-optimal certification algorithm exists that relies only on oracle calls, even when oracle cost is ignored.
- Dual certifications supply apothem-minimum upper bounds on single-point robustness regions.
- The ParallelepipedoNN implementation yields at least a two-fold increase in minimum certified edge length on MNIST and Fashion MNIST.
Where Pith is reading between the lines
- Systems that prioritize fast certification over maximum volume could adopt the apothem approach for routine safety checks.
- The oracle model highlights the need to account for real verifier inaccuracies when deploying these certifications.
- Dual certifications open the possibility of class-level safety guarantees rather than pointwise ones.
Load-bearing premise
That a neural network verifier functions as a perfect oracle whose only relevant cost is the number of calls and that existing proofs of intractability for volume-optimal certification apply unchanged to this oracle model.
What would settle it
An explicit algorithm that computes a volume-optimal certification using only polynomially many calls to a perfect verifier, or an experimental run on MNIST inputs where the number of oracle calls needed for an apothem-optimal certification exceeds linear scaling with domain diameter.
Figures
read the original abstract
A primary challenge in AI safety is the existence of adversarial examples -- slightly distorted inputs that cause a neural network (NN) to misclassify. To mitigate this problem, recent research focuses on the computation of robustness certifications, which, for a given input, determine the largest distortion the input may receive without breaking the network's prediction. Robustness certifications can be interpreted as an axis-aligned hyper-rectangle (multi-dimensional intervals). Most existing approaches focus on maximizing the certification's volume, but recent intractability results prohibit the computation of volume-optimal certifications in reasonable time. We introduce the apothem measure and show how to compute apothem-optimal certifications in a linear number of calls to a NN verifier (oracle) w.r.t. the input domain's diameter. Moreover, we prove that we cannot have a volume-optimal, oracle-based algorithm, even if we discard the oracle costs. Also, we introduce dual certifications -- an interval including all instances of a class -- thus providing apothem-minimum upper bounds to a robustness certification. Further, we present the ParallelepipedoNN system, which we evaluate on the standard MNIST and Fashion MNIST benchmarks. A preliminary comparison with existing work on the same datasets reveals at least two-fold improvement w.r.t. the minimum edge length.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces the apothem measure as an alternative to volume for robustness certifications of neural networks against adversarial examples. It claims an algorithm to compute apothem-optimal certifications via a linear number of calls (w.r.t. input domain diameter) to a perfect NN verifier oracle, proves that no volume-optimal oracle-based algorithm exists even ignoring oracle costs, introduces dual certifications as apothem-minimum upper bounds, and presents the ParallelepipedoNN system showing at least two-fold improvement in minimum edge length over prior work on MNIST and Fashion MNIST.
Significance. If the central claims hold, the work supplies a tractable oracle-based route to apothem-optimal robustness certificates together with an impossibility result separating it from volume maximization; the linear complexity and dual-certification construction are concrete strengths. The empirical two-fold gain on standard benchmarks would indicate practical relevance if the experimental controls are sound. The manuscript does not ship machine-checked proofs or open code, but the oracle-model analysis is a clear contribution if the reduction is tight.
major comments (3)
- [Impossibility proof for volume-optimal algorithms] The proof that no volume-optimal, oracle-based algorithm exists (abstract and the dedicated impossibility section) is load-bearing for the positioning against volume maximization. The manuscript must explicitly verify that its oracle (perfect yes/no robustness queries) matches the decision-oracle model, query type, and domain representation used in the cited intractability results; any mismatch (e.g., box queries vs. arbitrary linear constraints) would prevent the hardness result from transferring.
- [Apothem-optimal algorithm] § on the apothem-optimal algorithm: the linearity claim (linear in diameter) rests on treating the verifier as a perfect oracle whose only cost is the number of calls. The manuscript should state the precise query interface assumed and discuss how incompleteness or floating-point issues in real verifiers affect the claimed guarantees.
- [Experimental results] Empirical evaluation (MNIST/Fashion MNIST results): the two-fold improvement in minimum edge length is reported, yet the available description supplies neither the exact baselines, sample counts, nor how the minimum edge length is extracted from the certified regions, preventing assessment of whether the gain is robust or merely an artifact of the chosen metric.
minor comments (2)
- [Abstract] Abstract: the phrase 'preliminary comparison' should name the compared methods and the precise metric (minimum edge length) to allow immediate evaluation of the claimed improvement.
- [Preliminaries] Notation: ensure that 'apothem' and 'dual certification' are defined once with consistent symbols before their first algorithmic use.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major point below and will make revisions to strengthen the presentation of the oracle model, algorithm interface, and experiments.
read point-by-point responses
-
Referee: The proof that no volume-optimal, oracle-based algorithm exists (abstract and the dedicated impossibility section) is load-bearing for the positioning against volume maximization. The manuscript must explicitly verify that its oracle (perfect yes/no robustness queries) matches the decision-oracle model, query type, and domain representation used in the cited intractability results; any mismatch (e.g., box queries vs. arbitrary linear constraints) would prevent the hardness result from transferring.
Authors: We agree that explicit verification is required. Our impossibility proof uses a perfect yes/no decision oracle restricted to axis-aligned box queries, which directly matches the decision-oracle model and domain representation in the cited volume intractability results. We will add a dedicated paragraph in the impossibility section confirming this correspondence and ruling out any mismatch in query type. revision: yes
-
Referee: § on the apothem-optimal algorithm: the linearity claim (linear in diameter) rests on treating the verifier as a perfect oracle whose only cost is the number of calls. The manuscript should state the precise query interface assumed and discuss how incompleteness or floating-point issues in real verifiers affect the claimed guarantees.
Authors: We will revise the apothem-optimal algorithm section to state the precise interface: the oracle accepts an axis-aligned box and returns a yes/no answer on robust classification. The linear bound is strictly on the number of such calls. We will add a paragraph noting that the optimality claim is oracle-based; incomplete real verifiers may produce suboptimal (but still sound) certificates, and floating-point precision is an implementation concern outside the oracle model. revision: yes
-
Referee: Empirical evaluation (MNIST/Fashion MNIST results): the two-fold improvement in minimum edge length is reported, yet the available description supplies neither the exact baselines, sample counts, nor how the minimum edge length is extracted from the certified regions, preventing assessment of whether the gain is robust or merely an artifact of the chosen metric.
Authors: We will expand the experimental section to list the exact prior methods used as baselines with their reported minimum edge lengths, state the number of MNIST and Fashion MNIST test samples evaluated, and provide a precise description of how the minimum edge length is computed from each certified region. These additions will allow readers to verify the robustness of the reported two-fold gain. revision: yes
Circularity Check
No significant circularity detected; derivations are self-contained
full rationale
The paper defines the apothem measure independently of volume and presents an algorithm that computes apothem-optimal certifications via a linear number of oracle calls to a neural network verifier. It also claims to prove the non-existence of volume-optimal oracle-based algorithms. No equations or definitions reduce by construction to fitted parameters, self-referential inputs, or prior self-citations. The intractability result is stated as a proof internal to the paper rather than imported via overlapping-author citation. The oracle model is treated as given, with no evidence that the central claims collapse to renaming or ansatz smuggling. This is a normal non-finding for a paper whose core contributions rest on new definitions and an explicit proof rather than external self-referential support.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption A neural network verifier exists that can be queried as a black-box oracle for robustness queries.
Reference graph
Works this paper leans on
-
[1]
In: Latin American Theoretical Informatics Sympo- sium (2010)
Backer, J., Keil, J.M.: The Mono- and Bichromatic Empty Rectangle and Square Problems in All Dimensions. In: Latin American Theoretical Informatics Sympo- sium (2010)
2010
-
[2]
Discrete & Computational Geometry (2023)
Chan, T.M.: Faster Algorithms for Largest Empty Rectangles and Boxes. Discrete & Computational Geometry (2023)
2023
-
[3]
SIAM Journal on Computing (1986)
Chazelle, B., III, R.L.S.D., Lee, D.T.: Computing the Largest Empty Rectangle. SIAM Journal on Computing (1986)
1986
-
[4]
Revista Matemática Iberoamericana (1994)
Fefferman, C.: Reconstructing a neural net from its output. Revista Matemática Iberoamericana (1994)
1994
-
[5]
Constraints - An International Journal (2018)
Fischetti, M., Jo, J.: Deep neural networks and mixed integer linear optimization. Constraints - An International Journal (2018)
2018
-
[6]
In: Conference on Artificial Intelligence and Statistics (2010)
Glorot, X., Bengio, Y.: Understanding the Difficulty of Training Deep Feedforward Neural Networks. In: Conference on Artificial Intelligence and Statistics (2010)
2010
-
[7]
In: International Conference on Learning Representations (2015)
Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and Harnessing Adversarial Examples. In: International Conference on Learning Representations (2015)
2015
-
[8]
In: Scalable Scientific Data Management (2012)
Gutiérrez, G., Paramá, J.R.: Finding the Largest Empty Rectangle Containing Only a Query Point in Large Multidimensional Databases. In: Scalable Scientific Data Management (2012)
2012
-
[9]
In: Association for the Advancement of Artificial Intelligence (2019)
Ignatiev, A., Narodytska, N., Marques-Silva, J.: Abduction-Based Explanations for Machine Learning Models. In: Association for the Advancement of Artificial Intelligence (2019)
2019
-
[10]
Principles of Knowledge Representation and Reasoning (2024)
Izza, Y., Huang, X., Morgado, A., Planes, J., Ignatiev, A., Marques-Silva, J.: Distance-Restricted Explanations: Theoretical Underpinnings & Efficient Imple- mentation. Principles of Knowledge Representation and Reasoning (2024)
2024
-
[11]
In: Association for the Advancement of Artificial Intelligence (2024)
Izza, Y., Ignatiev, A., Stuckey, P.J., Marques-Silva, J.: Delivering Inflated Expla- nations. In: Association for the Advancement of Artificial Intelligence (2024)
2024
-
[12]
In: International Conference on Veri- fication, Model Checking, and Abstract Interpretation (2023)
Kabaha, A., Drachsler-Cohen, D.: Maximal Robust Neural Network Specifications via Oracle-Guided Numerical Optimization. In: International Conference on Veri- fication, Model Checking, and Abstract Interpretation (2023)
2023
-
[13]
In: Computer-Aided Verification (2017)
Katz, G., Barrett, C.W., Dill, D.L., Julian, K., Kochenderfer, M.J.: Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks. In: Computer-Aided Verification (2017)
2017
-
[14]
Formal Methods System Design (2022)
Katz, G., Barrett, C.W., Dill, D.L., Julian, K., Kochenderfer, M.J.: Reluplex: a Calculus for Reasoning about Deep Neural Networks. Formal Methods System Design (2022)
2022
-
[15]
In: Computer-Aided Verification (2019)
Katz, G., Huang, D.A., Ibeling, D., Julian, K., Lazarus, C., Lim, R., Shah, P., Thakoor, S., Wu, H., Zeljic, A., Dill, D.L., Kochenderfer, M.J., Barrett, C.W.: The Marabou Framework for Verification and Analysis of Deep Neural Networks. In: Computer-Aided Verification (2019)
2019
-
[16]
In: Interna- tional Conference on Learning Representations (2015)
Kingma, D.P., Ba, J.: Adam: A Method for Stochastic Optimization. In: Interna- tional Conference on Learning Representations (2015)
2015
-
[17]
ATT Labs (2010)
LeCun, Y., Cortes, C., Burges, C.J.: MNIST handwritten digit database. ATT Labs (2010)
2010
-
[18]
IEEE Transactions on Dependable and Secure Computing (2022) 18 M
Li, C., Ji, S., Weng, H., Li, B., Shi, J., Beyah, R., Guo, S., Wang, Z., Wang, T.: Towards Certifying the Asymmetric Robustness for Neural Networks: Quantifica- tion and Applications. IEEE Transactions on Dependable and Secure Computing (2022) 18 M. Papamichail et al
2022
-
[19]
Foundations and Trends in Optimization (2019)
Liu, C., Arnon, T., Lazarus, C., Barrett, C.W., Kochenderfer, M.J.: Algorithms for Verifying Deep Neural Networks. Foundations and Trends in Optimization (2019)
2019
-
[20]
In: International Conference on Machine Learning (2019)
Liu, C., Tomioka, R., Cevher, V.: On Certifying Non-Uniform Bounds against Adversarial Attacks. In: International Conference on Machine Learning (2019)
2019
-
[21]
Computer Research Repository (2017)
Lomuscio, A., Maganti, L.: An approach to reachability analysis for feed-forward ReLU neural networks. Computer Research Repository (2017)
2017
-
[22]
Computing Research Repository (2022)
Meng, M.H., Bai, G., Teo, S.G., Hou, Z., Xiao, Y., Lin, Y., Dong, J.S.: Adver- sarial Robustness of Deep Neural Networks: A Survey from a Formal Verification Perspective. Computing Research Repository (2022)
2022
-
[23]
SIAM (2009)
Moore, R.E., Kearfott, R.B., Cloud, M.J.: Introduction to Interval Analysis. SIAM (2009)
2009
-
[24]
Discrete Applied Mathematics (1984)
Naamad, A., Lee, D.T., Hsu, W.L.: On the maximum empty rectangle problem. Discrete Applied Mathematics (1984)
1984
-
[25]
In: Advances in Neural Information Processing Systems (2019)
Salman, H., Yang, G., Zhang, H., Hsieh, C.J., Zhang, P.: A Convex Relaxation Bar- rier to Tight Robustness Verification of Neural Networks. In: Advances in Neural Information Processing Systems (2019)
2019
-
[26]
Proceeding of the ACM on Programming Languages (2019)
Singh, G., Gehr, T., Püschel, M., Vechev, M.T.: An Abstract Domain for Certifying Neural Networks. Proceeding of the ACM on Programming Languages (2019)
2019
-
[27]
Research Association of Applied Geometry (1958)
Sunaga, T.: Theory of an Interval Algebra and its Application to Numerical Anal- ysis. Research Association of Applied Geometry (1958)
1958
-
[28]
In: International Conference on Learning Representations (2014)
Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I.J., Fergus, R.: Intriguing properties of neural networks. In: International Conference on Learning Representations (2014)
2014
-
[29]
In: International Conference on Machine Learning (2018)
Weng, T.W., Zhang, H., Chen, H., Song, Z., Hsieh, C.J., Daniel, L., Boning, D.S., Dhillon, I.S.: Towards Fast Computation of Certified Robustness for ReLU Net- works. In: International Conference on Machine Learning (2018)
2018
-
[30]
In: International Conference on Machine Learning (2018)
Wong, E., Kolter, J.Z.: Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope. In: International Conference on Machine Learning (2018)
2018
-
[31]
In: Computer-Aided Verification (2024)
Wu, H., Isac, O., Zeljic, A., Tagomori, T., Daggitt, M.L., Kokke, W., Refaeli, I., Amir, G., Julian, K., Bassan, S., Huang, P., Lahav, O., Wu, M., Zhang, M., Komendantskaya, E., Katz, G., Barrett, C.W.: Marabou 2.0: A Versatile Formal Analyzer of Neural Networks. In: Computer-Aided Verification (2024)
2024
-
[32]
In: Advances in Neural Information Processing Systems (2023)
Wu, M., Wu, H., Barrett, C.W.: VeriX: Towards Verified Explainability of Deep Neural Networks. In: Advances in Neural Information Processing Systems (2023)
2023
-
[33]
Computer Research Repository (2017)
Xiao, H., Rasul, K., Vollgraf, R.: Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms. Computer Research Repository (2017)
2017
-
[34]
In: International Conference on Learning Rep- resentations (2021) A Omitted proofs A.1 Proofs of Section 2 Proposition 1.For an intervalI∈I(d)| F x and the measures of Def
Xu, K., Zhang, H., Wang, S., Wang, Y., Jana, S., Lin, X., Hsieh, C.J.: Fast and Complete: Enabling Complete Neural Network Verification with Rapid and Mas- sively Parallel Incomplete Verifiers. In: International Conference on Learning Rep- resentations (2021) A Omitted proofs A.1 Proofs of Section 2 Proposition 1.For an intervalI∈I(d)| F x and the measure...
2021
-
[35]
There is no intervalJ∈I(d)| F x, s.t.x∈J,J∪ Q=J, andJ⊂(Q⊤x)
-
[36]
Moreover, letI, Jbe minimal, w.r.t
LetI, J∈I(d)| F x two intervals, s.t.(Q ∪ {x})⊆I, J. Moreover, letI, Jbe minimal, w.r.t. set inclusion, achieving this property. Then,I=J
-
[37]
Proof.The first desideratum results immediately from Def
For everyJ∈I(d)| F x, s.t.x∈J, andJ∪ Q=J, thenϖ(J)≥ϖ(Q⊤x). Proof.The first desideratum results immediately from Def. 7. For the second desideratum, we will prove that ifJ⊂(Q⊤x), then there is someq∈Q, s.t.q/∈J. LetQ⊤x=x+ [−r , r], withr , r≥0as in Def. 7. Consider thek-th coordinate of the upper endpointri. From Def. 7 there is some q∈ Q, s.t.q k = rk. Si...
-
[38]
After processing all points inQwe take,a n i = max j{xi −qj,i |q j ∈ Q}andb n i = max j{qj+1,i −xi |q j ∈ Q}
For the positive pointqk+1, the minimum interval update providesaj+1 i = max{aj i , xi −q j+1,i}, andb j+1 i = max{b j i , qj+1,i −x i}. After processing all points inQwe take,a n i = max j{xi −qj,i |q j ∈ Q}andb n i = max j{qj+1,i −xi |q j ∈ Q}. So, we have,I n =x+ [−a n,b n]. By Def. 7,R i = sup{x i −q i |q∈ Q}and Ri = sup{q i −x i |q∈ Q}. SinceQis fini...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.