arxiv: 2606.09746 · v1 · pith:5L7PHXDKnew · submitted 2026-06-08 · 💻 cs.CV · cs.AI · cs.LG

Hybrid Robustness Verification for Spatio-Temporal Neural Networks

Pith reviewed 2026-06-27 17:09 UTC · model grok-4.3

classification 💻 cs.CV · cs.AI · cs.LG
keywords robustness verification · 3D CNN · spatio-temporal constraints · bound propagation · video classification · adversarial robustness · certified accuracy

The pith

Modeling adversarial attacks as changes to subsets of frames or patches yields tighter certified robustness bounds for 3D CNN video networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that realistic spatio-temporal constraints on adversarial perturbations (limited to subsets of frames, or to patches within consecutive frames) enable tighter and more scalable robustness verification than standard lp-norm bounds for 3D convolutional networks. Existing methods either over-approximate by allowing noise everywhere or become too slow, which limits their use in safety-critical video tasks such as action recognition and autonomous driving. By computing an exact closed-form bound for the first convolutional layer under these constraints and then propagating approximations through later layers, the method produces certified accuracy figures that hold under the assumed threat model. The authors also release a benchmark to support further work on verifiable video models.

Core claim

STBP computes an exact closed-form characterization of the first convolutional layer under spatio-temporal perturbation constraints and propagates certified bounds through the remainder of a 3D CNN using scalable approximations, yielding stronger robustness guarantees than prior verification methods on video and volumetric inputs.

What carries the argument

Spatio-Temporal Bound Propagation (STBP), which replaces full lp-norm perturbation sets with constraints that allow modification of only a subset of frames or patches inside consecutive frames and derives exact output bounds for the initial convolution before using interval or other approximations downstream.
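
The two-stage idea can be illustrated with a toy sketch. It is not the authors' implementation: the function names, the flattened-input representation, and the binary `mask` that marks perturbable entries are all assumptions made for illustration. It relies only on the standard fact that, for a single affine layer with a box-shaped input set, per-output interval bounds are exact, and it uses plain interval bound propagation for the remaining layers.

```python
# Toy sketch (hypothetical names, not the paper's code): a masked l_inf
# perturbation is bounded exactly at the first affine layer, then propagated
# with ordinary interval arithmetic through ReLU and later affine layers.

def affine_bounds_masked(W, b, x, mask, eps):
    """Per-output bounds of y = W x + b when only entries with mask[i] == 1
    may move by at most eps; all other entries of x stay fixed."""
    lower, upper = [], []
    for row, bias in zip(W, b):
        center = sum(w * xi for w, xi in zip(row, x)) + bias
        # Only weights that touch perturbable inputs contribute to the radius.
        radius = eps * sum(abs(w) for w, m in zip(row, mask) if m)
        lower.append(center - radius)
        upper.append(center + radius)
    return lower, upper

def relu_bounds(lower, upper):
    """ReLU is monotone, so it maps interval endpoints to endpoints."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]

def affine_bounds_interval(W, b, lower, upper):
    """Standard interval bound propagation through y = W h + b."""
    out_l, out_u = [], []
    for row, bias in zip(W, b):
        lo = bias + sum(w * (l if w >= 0 else u) for w, l, u in zip(row, lower, upper))
        hi = bias + sum(w * (u if w >= 0 else l) for w, l, u in zip(row, lower, upper))
        out_l.append(lo)
        out_u.append(hi)
    return out_l, out_u

def propagate(layers, x, mask, eps):
    """layers is a list of (W, b) pairs with ReLU between consecutive layers."""
    (W1, b1), rest = layers[0], layers[1:]
    lower, upper = affine_bounds_masked(W1, b1, x, mask, eps)
    for W, b in rest:
        lower, upper = relu_bounds(lower, upper)
        lower, upper = affine_bounds_interval(W, b, lower, upper)
    return lower, upper
```

Calling `propagate` once with a mask that covers only a patch and once with an all-ones mask (the full l_inf ball) shows how restricting the perturbed entries shrinks the first-layer radius, and how that narrower interval carries through the later layers.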

If this is right

  • Certified robust accuracy improves by a factor of 1.7 under identical budgets because the constraint set is smaller than the full lp-norm ball.
  • Verification becomes feasible on larger video networks and longer sequences because exact bounds are computed only for the first layer.
  • The same framework applies across action recognition, driving video, and medical volumetric data once the spatio-temporal constraint pattern is chosen.
  • ST-Bench supplies standardized datasets and threat models for comparing future verification algorithms on spatio-temporal inputs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the chosen frame or patch subsets match the structure of common physical attacks (occlusion, lighting changes over short intervals), the method could be used to certify models before deployment rather than only for research.
  • Extending the exact closed-form step to additional early layers might further reduce the need for approximations without losing scalability.

Load-bearing premise

Real-world adversaries are limited to changing only a subset of frames or patches within consecutive frames rather than being able to alter every frame independently.

What would settle it

An attack that succeeds against the network by modifying more frames or patches than the modeled constraint allows, while staying inside the same overall perturbation budget, would show that the certified accuracy no longer applies to that attack.

Figures

Figures reproduced from arXiv: 2606.09746 by Alessio Lomuscio, Matthew Wicker, Sherwin Varghese.

Figure 1: Overview of Spatio-Temporal Bound Propagation (STBP) for verifying spatio-temporal adversarial patch propagation.
    Surrounding text: … induce incorrect predictions [45,14,11]. Such failures pose serious risks in high-stakes applications, motivating the use of formal verification to provide rigorous robustness guarantees. Existing verification methods typically cast robustness as a non-convex optimization problem and rely on co…
Figure 2: Adversarial robustness of IBP, STBP-IBP, STBP-Lipschitz, and STBP-Löwner–John Sampling using adversarial patches for an MNIST video model with 10 frames of size 28 × 28: (a) robustness against perturbation magnitude (ϵ); (b) robustness against patch size (k).
    Surrounding text: VideoStar represents reachable sets using symbolic tuples (c, V, P, l, u), where c denotes an anchor video, V a set of generator videos, and P linea…
Figure 3: Per-layer average bound width for pure IBP versus STBP-IBP on the MNIST video model (10 frames, 28 × 28) at 3 perturbation magnitudes (ϵ ∈ {0.1, 0.01, 0.001}). STBP-IBP consistently produces tighter bounds at every layer, with the tightness advantage established at Conv1 propagating through Conv2, Conv3, FC1, and FC2.
    Surrounding text: … per-pixel attacker typically assumed in flattened-MNIST benchmarks. Smaller certifiable ϵ…
Original abstract

With AI increasingly deployed in safety-critical systems, providing formal robustness guarantees for the underlying models is essential. Existing verification methods either rely on overly conservative approximations or incur prohibitive computational costs. For example, the use of lp-norm perturbations in video settings encodes the belief that the adversary can inject noise in every video frame. In practice, adversarial perturbations exhibit structured spatial and temporal correlations, constrained to lower-dimensional, semantically meaningful subspaces. In this work, we study robustness verification of 3D CNNs processing video and volumetric inputs, targeting applications in action recognition (UCF-101), autonomous driving (Udacity), and medical imaging (MedMNIST) exploiting realistic assumptions on adversarial strength by modelling them as spatio-temporal constraints - where the attacker can modify either a subset of frames or patches within a set of consecutive frames. We demonstrate that modelling realistic constraints enables tighter approximations. We introduce Spatio-Temporal Bound Propagation (STBP), a verification framework that computes an exact closed-form characterization of the first convolutional layer and propagates certified bounds through subsequent layers using scalable approximations. Computing the exact closed form provides the tightest bounds for the first convolutional layer. Thus, we utilise approximation methods in the remainder of the network. To spur further progress in this field, we propose ST-Bench, a verification benchmark for autonomous driving and activity recognition, to systematically evaluate verifiable robustness. Compared to existing verification-based approaches, STBP provides stronger robustness guarantees with significantly improved scalability, achieving 1.7x higher certified robust accuracy under identical perturbation budgets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces Spatio-Temporal Bound Propagation (STBP) for formal robustness verification of 3D CNNs on video and volumetric inputs. It replaces standard per-frame lp-norm perturbation models with spatio-temporal constraints (adversary limited to subsets of frames or patches within consecutive frames), derives an exact closed-form characterization of the first convolutional layer, and propagates bounds via scalable approximations through the remainder of the network. Evaluations on UCF-101, Udacity, and MedMNIST claim 1.7x higher certified robust accuracy versus prior verification methods under identical budgets; the work also proposes the ST-Bench benchmark.

Significance. If the experimental comparisons are shown to be under equivalent threat models, the approach could meaningfully advance scalable verification for spatio-temporal networks by obtaining tighter certified bounds through more realistic perturbation modeling, with direct relevance to safety-critical domains such as autonomous driving and medical imaging. The ST-Bench benchmark would be a useful community contribution for standardized evaluation.

major comments (2)
  1. [Abstract] The central claim that STBP achieves '1.7x higher certified robust accuracy under identical perturbation budgets' compares against baselines that encode full per-frame lp-norm adversaries, while STBP restricts the adversary to a strictly smaller spatio-temporal constraint set. Because certified accuracy is monotonically non-decreasing as the allowed perturbation set shrinks, the reported gain may be driven by the modeling choice rather than the bound-propagation technique; no evidence is given that the baselines were re-evaluated under the identical spatio-temporal constraints.
  2. [Threat-model and experimental sections] The paper asserts that the spatio-temporal constraints enable tighter approximations, yet provides no separate ablation that isolates bound tightness from the change in perturbation set (e.g., by reporting certified accuracy of the same STBP procedure under both the restricted and the full lp-norm models). This distinction is load-bearing for the claim of 'stronger robustness guarantees' attributable to STBP itself.
minor comments (2)
  1. [Abstract] The phrase 'under identical perturbation budgets' is used without an accompanying formal statement of how the budgets are equated across the two threat models.
  2. The manuscript would benefit from an explicit statement of the precise mathematical definition of the spatio-temporal constraint set (e.g., as a subset of the standard lp ball) to allow readers to verify the subset relation.
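
The subset relation and the monotonicity argument in the comments above can be written out as follows. The notation is an assumption introduced here for illustration and is not taken from the paper: $S$ denotes the index set of entries the adversary may change, and $m$ denotes the classification margin of the true class.

```latex
% Hypothetical notation: x is the clean input, \epsilon the budget,
% S the set of perturbable indices (e.g. a patch over consecutive frames).
\[
  B_\epsilon(x) = \{\, x' : \|x' - x\|_\infty \le \epsilon \,\}, \qquad
  C_{\epsilon,S}(x) = \{\, x' \in B_\epsilon(x) : x'_i = x_i \ \text{for all } i \notin S \,\}.
\]
% By construction the constrained set is contained in the full ball:
\[
  C_{\epsilon,S}(x) \subseteq B_\epsilon(x)
  \quad\Longrightarrow\quad
  \min_{x' \in C_{\epsilon,S}(x)} m(x') \;\ge\; \min_{x' \in B_\epsilon(x)} m(x').
\]
% Hence any input certified on B_\epsilon(x), i.e. with a positive minimum
% margin there, is also certified on C_{\epsilon,S}(x), and certified accuracy
% under the constrained model is at least that under the full l_inf model.
```

If the patch location is unknown, the threat model becomes a union of such sets over the admissible choices of $S$; that union is still contained in $B_\epsilon(x)$, so the same inequality applies.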

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and for highlighting the distinction between threat-model changes and algorithmic improvements. We agree that the reported gains conflate the two and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [Abstract] The central claim that STBP achieves '1.7x higher certified robust accuracy under identical perturbation budgets' compares against baselines that encode full per-frame lp-norm adversaries, while STBP restricts the adversary to a strictly smaller spatio-temporal constraint set. Because certified accuracy is monotonically non-decreasing as the allowed perturbation set shrinks, the reported gain may be driven by the modeling choice rather than the bound-propagation technique; no evidence is given that the baselines were re-evaluated under the identical spatio-temporal constraints.

    Authors: We accept the observation. The 1.7x figure compares STBP under the new spatio-temporal threat model against prior methods under their standard per-frame lp-norm models; the sets are not identical. The paper's intent is to advocate for the more realistic (smaller) perturbation sets, which inherently permit higher certified accuracy, and to supply an efficient verifier (STBP) for them. We will revise the abstract and add a dedicated paragraph in the experimental section stating that the numerical improvement arises from both the threat-model restriction and the verification procedure, and that baselines were not re-implemented under the spatio-temporal constraints because those methods assume independent per-frame perturbations. revision: yes

  2. Referee: [Threat-model and experimental sections] The paper asserts that the spatio-temporal constraints enable tighter approximations, yet provides no separate ablation that isolates bound tightness from the change in perturbation set (e.g., by reporting certified accuracy of the same STBP procedure under both the restricted and the full lp-norm models). This distinction is load-bearing for the claim of 'stronger robustness guarantees' attributable to STBP itself.

    Authors: We agree that such an ablation is necessary to separate the modeling effect from the bound-propagation technique. STBP's exact closed-form layer is derived specifically for the spatio-temporal constraints; the full lp-norm case can be recovered as a special instance by allowing perturbations in every frame. We will add an ablation that runs the identical STBP pipeline (exact first-layer bounds followed by the same scalable propagation) once under the restricted spatio-temporal model and once under the full lp-norm model on the same networks and datasets, thereby isolating the contribution of each factor. revision: yes
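
The shape of such an ablation can be sketched on a toy problem. Everything below is an illustrative assumption rather than the paper's protocol: a linear classifier stands in for the network (so the worst-case margin has an exact closed form), and the restricted and full threat models differ only in the `mask` argument.

```python
# Toy ablation harness (hypothetical, not the authors' pipeline): the same
# certification routine is run under a patch mask and under an all-ones mask,
# so any difference in certified accuracy comes from the perturbation set alone.

def certified(W, b, x, label, mask, eps):
    """True if the linear classifier keeps predicting `label` for every
    perturbation that moves masked entries of x by at most eps (l_inf)."""
    for j in range(len(W)):
        if j == label:
            continue
        diff = [wy - wj for wy, wj in zip(W[label], W[j])]
        margin = sum(d * xi for d, xi in zip(diff, x)) + b[label] - b[j]
        # The worst case subtracts eps * |diff_i| for each perturbable entry.
        worst = margin - eps * sum(abs(d) for d, m in zip(diff, mask) if m)
        if worst <= 0:
            return False
    return True

def certified_accuracy(W, b, data, mask, eps):
    """Fraction of (x, label) pairs that are certified under the given mask."""
    return sum(certified(W, b, x, y, mask, eps) for x, y in data) / len(data)

# Illustrative data: two classes, four input entries.
W = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
b = [0.0, 0.0]
data = [([1.0, 0.2, 1.0, 0.2], 0),
        ([0.0, 2.0, 0.0, 2.0], 1),
        ([0.6, 0.5, 0.5, 0.5], 0)]

patch_acc = certified_accuracy(W, b, data, [1, 1, 0, 0], 0.5)  # restricted model
full_acc = certified_accuracy(W, b, data, [1, 1, 1, 1], 0.5)   # full l_inf model
```

Reporting both numbers side by side, with the verifier held fixed, is one way to separate the effect of the threat model from the effect of the bound-propagation technique.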

Circularity Check

0 steps flagged

No circularity detected; verification bounds derived independently from architecture and constraints

Full rationale

The paper presents STBP as a new construction: an exact closed-form characterization of the first convolutional layer under spatio-temporal constraints, followed by scalable approximations for later layers. No equations reduce to fitted parameters renamed as predictions, no self-citations form load-bearing uniqueness arguments, and no ansatz is smuggled via prior work. The 1.7x certified accuracy claim is an experimental outcome under explicitly stated weaker perturbation models rather than a definitional tautology. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the domain assumption that realistic adversarial perturbations can be modeled as spatio-temporal subset constraints and that exact characterization of the first layer is feasible; no free parameters or invented entities are explicitly introduced in the abstract.

axioms (2)
  • domain assumption: Adversarial perturbations exhibit structured spatial and temporal correlations constrained to lower-dimensional subspaces.
    Invoked in abstract to justify moving beyond lp-norm perturbations.
  • domain assumption: The first convolutional layer admits an exact closed-form characterization under the spatio-temporal constraints.
    Stated as providing the tightest bounds before approximations are applied to later layers.

Reference graph

Works this paper leans on

64 extracted references · 11 canonical work pages

  1. [1]

    Agarwal, A., Vatsa, M., Singh, R., Ratha, N.K.: Noise is inside me! generating adversarial perturbations with noise derived from natural filters. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops (June 2020)

  2. [2]

    Balunovic, M., Baader, M., Singh, G., Gehr, T., Vechev, M.T.: Certifying geometric robustness of neural networks. In: Wallach, H.M., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E.B., Garnett, R. (eds.) Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, Dec...

  3. [3]

    Banerjee, D., Singh, G.: Relational dnn verification with cross executional bound refinement. In: Proceedings of the 41st International Conference on Machine Learning. ICML’24, JMLR.org (2024)

  4. [4]

    Botoeva, E., Kouvaros, P., Kronqvist, J., Lomuscio, A., Misener, R.: Efficient verification of relu-based neural networks via dependency analysis. In: The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educationa...

  5. [5]

    Che, H., Chen, S., Chen, H.: Image Quality-aware Diagnosis via Meta-knowledge Co-embedding. In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 19819–19829. IEEE Computer Society, Los Alamitos, CA, USA (Jun 2023). https://doi.org/10.1109/CVPR52729.2023.01898, https://doi.ieeecomputersociety.org/10.1109/CVPR52729.2023.01898

  6. [6]

    Che, H., Cheng, Y., Jin, H., Chen, H.: Towards generalizable diabetic retinopathy grading in unseen domains. In: Greenspan, H., Madabhushi, A., Mousavi, P., Salcudean, S., Duncan, J., Syeda-Mahmood, T., Taylor, R. (eds.) Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. pp. 430–440. Springer Nature Switzerland, Cham (2023)

  7. [7]

    Chiang, P., Ni, R., Abdelkader, A., Zhu, C., Studer, C., Goldstein, T.: Certified defenses for adversarial patches. In: 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenReview.net (2020), https://openreview.net/forum?id=HyeaSkrYPH

  8. [8]

    De Palma, A., Bunel, R.R., Dvijotham, K.D., Kumar, M.P., Stanforth, R., Lomuscio, A.: Expressive losses for verified robustness via convex combinations. In: The Twelfth International Conference on Learning Representations (2024)

  9. [9]

    Dutta, S., Chen, X., Sankaranarayanan, S.: Reachability analysis for neural feedback systems using regressive polynomial rule inference. In: Proceedings of the 22nd ACM International Conference on Hybrid Systems: Computation and Control (HSCC19). pp. 157–168. ACM (2019)

  10. [10]

    Dvijotham, K., Stanforth, R., Gowal, S., Mann, T.A., Kohli, P.: A dual approach to scalable verification of deep networks. In: UAI. vol. 1, p. 3 (2018)

  11. [11]

    Fawzi, A., Fawzi, O., Frossard, P.: Analysis of classifiers’ robustness to adversarial perturbations. Mach. Learn. 107(3), 481–508 (2018). https://doi.org/10.1007/S10994-017-5663-3, https://doi.org/10.1007/s10994-017-5663-3

  12. [12]

    Fazlyab, M., Morari, M., Pappas, G.J.: Efficient and accurate estimation of lipschitz constants for deep neural networks. In: Advances in Neural Information Processing Systems (NeurIPS) (2019)

  13. [13]

    Gehr, T., Mirman, M., Drachsler-Cohen, D., Tsankov, P., Chaudhuri, S., Vechev, M.: Ai2: Safety and robustness certification of neural networks with abstract interpretation. In: 2018 IEEE Symposium on Security and Privacy (SP). pp. 3–18 (2018). https://doi.org/10.1109/SP.2018.00058

  14. [15]

    Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. In: Bengio, Y., LeCun, Y. (eds.) 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings (2015), http://arxiv.org/abs/1412.6572

  15. [16]

    Gowal, S., Dvijotham, K., Stanforth, R., Bunel, R., Qin, C., Uesato, J., Arandjelović, R., Mann, T.A., Kohli, P.: On the effectiveness of interval bound propagation for training verifiably robust models. ArXiv abs/1810.12715 (2018), https://api.semanticscholar.org/CorpusID:53112003

  16. [17]

    Hein, M., Andriushchenko, M.: Formal guarantees on the robustness of a classifier against adversarial manipulation. In: Advances in Neural Information Processing Systems (NeurIPS) (2017)

  17. [18]

    Hingun, N., Sitawarin, C., Li, J., Wagner, D.: Reap: A large-scale realistic adversarial patch benchmark. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). pp. 4640–4651 (October 2023)

  18. [19]

    Hu, H., Liu, C., Zhao, D.: Robustness verification for perception models against camera motion perturbations. In: International conference on machine learning (ICML) Workshop on Formal Verification of Machine Learning (WFVML) (2023)

  19. [20]

    Huchette, J., Muñoz, G., Serra, T., Tsay, C.: When deep learning meets polyhedral theory: A survey. INFORMS Journal on Computing (2026)

  20. [21]

    Jordan, M., Hayase, J., Dimakis, A., Oh, S.: Zonotope domains for lagrangian neural network verification. In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 -...

  21. [22]

    Katz, G., Barrett, C., Dill, D.L., Julian, K., Kochenderfer, M.J.: Reluplex: An efficient smt solver for verifying deep neural networks. In: Computer Aided Verification: 29th International Conference, CAV 2017, Heidelberg, Germany, July 24-28, 2017, Proceedings, Part I 30. pp. 97–117. Springer (2017)

  22. [23]

    Katz, G., Huang, D.A., Ibeling, D., Julian, K., Lazarus, C., Lim, R., Shah, P., Thakoor, S., Wu, H., Zeljić, A., et al.: The marabou framework for verification and analysis of deep neural networks. In: Computer Aided Verification: 31st International Conference, CAV 2019, New York City, NY, USA, July 15-18, 2019, Proceedings, Part I 31. pp. 443–452. Spri...

  23. [24]

    Lambert, J., Ceruti, A., Spliethoff, H.: Benchmark of mixed-integer linear programming formulations for district heating network design. Energy 308, 132885 (2024)

  24. [25]

    Lechner, M., Žikelić, Ð., Chatterjee, K., Henzinger, T.: Infinite time horizon safety of bayesian neural networks. Advances in Neural Information Processing Systems 34, 10171–10185 (2021)

  25. [26]

    LeCun, Y., Cortes, C., Burges, C.J.: The mnist database of handwritten digits (1998)

  26. [27]

    Lee, H.J., Ro, Y.M.: Defending video recognition model against adversarial perturbations via defense patterns. IEEE Transactions on Dependable and Secure Computing 21(4), 4110–4121 (2023)

  27. [28]

    Lee, I.G., Zhang, Q., Yoon, S.W., Won, D.: A mixed integer linear programming support vector machine for cost-effective feature selection. Knowledge-based systems 203, 106145 (2020)

  28. [29]

    Lin, Y.H., Wang, Y., He, D., Lee, L.H.: Last-mile delivery: Optimal locker location under multinomial logit choice model. Transportation Research Part E: Logistics and Transportation Review 142, 102059 (2020)

  29. [30]

    Lomuscio, A., Maganti, L.: An approach to reachability analysis for feed-forward relu neural networks. CoRR abs/1706.07351 (2017), http://arxiv.org/abs/1706.07351

  30. [31]

    Luo, W., Yang, B., Urtasun, R.: Fast and furious: Real time end-to-end 3d detection, tracking and motion forecasting with a single convolutional net. In: 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, June 18-22, 2018. pp. 3569–3577. Computer Vision Foundation / IEEE Computer Society (2018). https:...

  31. [32]

    Mao, Y., Müller, M.N., Fischer, M., Vechev, M.T.: Connecting certified and adversarial training. In: Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. (eds.) Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16...

  32. [33]

    Massena, T., Friedrich, C., Mamalet, F., Serrurier, M.: Fast and flexible robustness certificates for semantic segmentation (2025), https://arxiv.org/abs/2512.06010

  33. [34]

    Müller, M.N., Eckert, F., Fischer, M., Vechev, M.T.: Certified training: Small boxes are all you need. In: The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net (2023), https://openreview.net/forum?id=7oFuxtJtUMH

  34. [35]

    Mummadi, C.K., Brox, T., Metzen, J.H.: Defending against universal perturbations with shared adversarial training. In: 2019 IEEE/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea (South), October 27 - November 2, 2019. pp. 4927–4936. IEEE (2019). https://doi.org/10.1109/ICCV.2019.00503, https://doi.org/10.1109/ICCV.2019.00503

  35. [36]

    Sasaki, S., Lopez, D.M., Robinette, P.K., Johnson, T.T.: Robustness verification of video classification neural networks. In: Proceedings of the IEEE/ACM 47th International Conference on Software Engineering (2025)

  36. [37]

    Singh, G., Gehr, T., Mirman, M., Püschel, M., Vechev, M.: Fast and effective robustness certification. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. (eds.) Advances in Neural Information Processing Systems. vol. 31. Curran Associates, Inc. (2018), https://proceedings.neurips.cc/paper_files/paper/2018/file/f2f4469...

  37. [38]

    Singh, G., Gehr, T., Püschel, M., Vechev, M.: An abstract domain for certifying neural networks. Proceedings of the ACM on Programming Languages 3(POPL), 1–30 (2019)

  38. [39]

    Singla, S., Feizi, S.: Skew orthogonal convolutions. In: International Conference on Machine Learning (ICML) (2021)

  39. [40]

    Soomro, K., Zamir, A.R., Shah, M.: UCF101: A dataset of 101 human actions classes from videos in the wild. CoRR abs/1212.0402 (2012), http://arxiv.org/abs/1212.0402

  40. [41]

    Sosnin, P., Knapp, J., Kennedy, F., Collyer, J., Tsay, C.: Exact certification of data-poisoning attacks using mixed-integer programming. arXiv preprint arXiv:2602.16944 (2026)

  41. [42]

    Sosnin, P., Müller, M.N., Baader, M., Tsay, C., Wicker, M.: Certified robustness to data poisoning in gradient-based training. Transactions on Machine Learning Research (TMLR) (2024)

  42. [43]

    Sosnin, P., Wicker, M., Collyer, J., Tsay, C.: Abstract gradient training: A unified certification framework for data poisoning, unlearning, and differential privacy. arXiv preprint arXiv:2511.09400 (2025)

  43. [44]

    Stallkamp, J., Schlipsing, M., Salmen, J., Igel, C.: The german traffic sign recognition benchmark: A multi-class classification competition. In: The 2011 International Joint Conference on Neural Networks. pp. 1453–1460 (2011). https://doi.org/10.1109/IJCNN.2011.6033395

  44. [45]

    Szegedy, C.: Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199 (2013)

  45. [46]

    Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., Fergus, R.: Intriguing properties of neural networks. In: International Conference on Learning Representations (ICLR) (2014)

  46. [47]

    Tian, Y., Ye, Q., Doermann, D.: Yolov12: Attention-centric real-time object detectors. arXiv preprint arXiv:2502.12524 (2025)

  47. [49]

    Tjeng, V., Xiao, K.Y., Tedrake, R.: Evaluating robustness of neural networks with mixed integer programming. In: International Conference on Learning Representations (2019), https://openreview.net/forum?id=HyGIdiRqtm

  48. [50]

    Tramer, F., Boneh, D.: Adversarial training and robustness for multiple perturbations. In: Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems. vol. 32. Curran Associates, Inc. (2019), https://proceedings.neurips.cc/paper_files/paper/2019/file/5d4ae76f053f8f2516ad1...

  49. [51]

    Tran, H.D., Bak, S., Xiang, W., Johnson, T.T.: Verification of deep convolutional neural networks using imagestars. In: Proceedings of the 32nd International Conference on Computer-Aided Verification (CAV). Springer (2020)

  50. [52]

    Tsuzuku, Y., Sato, I., Sugiyama, M.: Lipschitz-margin training: Scalable certification of perturbation invariance for deep neural networks. In: Advances in Neural Information Processing Systems (NeurIPS) (2018)

  51. [53]

    Udacity, A.: Udacity self-driving car dataset (2017)

  52. [54]

    Virmaux, A., Scaman, K.: Lipschitz regularity of deep neural networks: analysis and efficient estimation. In: Advances in Neural Information Processing Systems (NeurIPS) (2018)

  53. [55]

    Wang, F., Xu, P., Ruan, W., Huang, X.: Towards verifying the geometric robustness of large-scale neural networks. In: Williams, B., Chen, Y., Neville, J. (eds.) Thirty-Seventh AAAI Conference on Artificial Intelligence, AAAI 2023, Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence, IAAI 2023, Thirteenth Symposium on Educationa...

  54. [56]

    Wang, Y., Zhao, H., Gummadi, D., Dianati, M., Debattista, K., Donzella, V.: Benchmarking the robustness of panoptic segmentation for automated driving. CoRR abs/2402.15469 (2024). https://doi.org/10.48550/ARXIV.2402.15469, https://doi.org/10.48550/arXiv.2402.15469

  55. [57]

    Wicker, M., Huang, X., Kwiatkowska, M.: Feature-guided black-box safety testing of deep neural networks. In: International Conference on Tools and Algorithms for the Construction and Analysis of Systems. pp. 408–426. Springer (2018)

  56. [58]

    Wicker, M., Laurenti, L., Patane, A., Paoletti, N., Abate, A., Kwiatkowska, M.: Certification of iterative predictions in bayesian neural networks. In: Uncertainty in Artificial Intelligence. pp. 1713–1723. PMLR (2021)

  57. [59]

    Wicker, M., Laurenti, L., Patane, A., Paoletti, N., Abate, A., Kwiatkowska, M.: Probabilistic reach-avoid for bayesian neural networks. Artificial Intelligence 334, 104132 (2024)

  58. [60]

    Wicker, M.R., Sosnin, P., Shilov, I., Janik, A., Mueller, M.N., De Montjoye, Y.A., Weller, A., Tsay, C.: Certification for differentially private prediction in gradient-based training. In: International Conference on Machine Learning. pp. 66726–66745. PMLR (2025)

  59. [61]

    Wong, E., Kolter, Z.: Provable defenses against adversarial examples via the convex outer adversarial polytope. In: International conference on machine learning. pp. 5286–5295. PMLR (2018)

  60. [62]

    Wu, M., Kwiatkowska, M.: Robustness guarantees for deep neural networks on videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 308–317 (2020). https://doi.org/10.1109/CVPR42600.2020.00039

  61. [63]

    Wu, M., Wicker, M., Ruan, W., Huang, X., Kwiatkowska, M.: A game-based approximate verification of deep neural networks with provable guarantees. Theoretical Computer Science 807, 298–329 (2020). https://doi.org/10.1016/j.tcs.2019.05.046, https://www.sciencedirect.com/science/article/pii/S0304397519304426

  62. [64]

    Wu, Y., Chen, H., Makki, A.P., Nguyen, P., Yesha, Y.: Efficient data-sketches and fine-tuning for early detection of distributional drift in medical imaging. CoRR abs/2408.08456 (2024). https://doi.org/10.48550/ARXIV.2408.08456, https://doi.org/10.48550/arXiv.2408.08456

  63. [65]

    Yang, J., Shi, R., Wei, D., Liu, Z., Zhao, L., Ke, B., Pfister, H., Ni, B.: Medmnist v2: A large-scale lightweight benchmark for 2d and 3d biomedical image classification. CoRR abs/2110.14795 (2021), https://arxiv.org/abs/2110.14795

  64. [66]

    Zhang, Y., Kouvaros, P., Lomuscio, A.: Scalable neural network geometric robustness validation via Hölder optimisation. In: Advances in Neural Information Processing Systems (NeurIPS). vol. 38 (2025), https://proceedings.neurips.cc/paper_files/paper/2025/hash/1435ac9924d7621e44aa2407a1cfdec7-Abstract-Conference.html