Certified Training with Branch-and-Bound for Lyapunov-stable Neural Control

Cho-Jui Hsieh; Haoyu Li; Huan Zhang; Zhouxing Shi

arxiv: 2411.18235 · v3 · submitted 2024-11-27 · 💻 cs.LG · cs.AI· cs.RO· cs.SY· eess.SY

Certified Training with Branch-and-Bound for Lyapunov-stable Neural Control

Zhouxing Shi , Haoyu Li , Cho-Jui Hsieh , Huan Zhang This is my paper

Pith reviewed 2026-05-23 16:39 UTC · model grok-4.3

classification 💻 cs.LG cs.AIcs.ROcs.SYeess.SY

keywords neural controlLyapunov stabilitycertified trainingbranch-and-boundregion of attractionneural network verificationoutput feedback control

0 comments

The pith

Incorporating branch-and-bound splitting into the training loop produces neural controllers whose Lyapunov stability can be certified over much larger regions and verified much faster.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents CT-BaB, a training procedure that interleaves bound computation with optimization so that the neural controller is shaped by the same certified bounds used later for verification. It maintains a dynamic set of input regions and repeatedly splits the hardest ones to tighten the lower bounds on the Lyapunov decrease condition. Because the splits created during training are reused at test time, the verification procedure avoids redundant work and produces tighter overall guarantees. On the largest output-feedback 2D quadrotor task examined, the resulting controllers admit regions of attraction 164 times larger than those obtained by prior certified training while cutting verification time by more than a factor of eleven. A reader would care because the method directly attacks the mismatch between what is optimized during learning and what must be proved afterward for safety-critical control.

Core claim

CT-BaB is a certified training framework that optimizes certified bounds on the Lyapunov asymptotic stability condition by running branch-and-bound at training time. It maintains a dynamic training dataset of input subregions and adaptively splits hard subregions into smaller ones, tightening the certified bounds while easing the training objective. The same subregions are then used to guide a training-aware verification procedure at test time, yielding models that are both easier to verify and able to certify stability over larger regions of attraction.

What carries the argument

Training-time branch-and-bound that adaptively splits hard input subregions to tighten certified Lyapunov bounds and reuses those splits for test-time verification.

If this is right

Models trained with CT-BaB admit larger certified regions of attraction than models trained with counterexample-guided methods.
Test-time verification reuses the training-time splits and therefore runs substantially faster.
The discrepancy between the bounds optimized during training and the bounds computed at verification is reduced.
The same adaptive splitting technique informs both the training objective and the final verification procedure.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The reuse of training splits could be applied to other bound-based verification tasks where the same regions must be certified repeatedly.
If the splitting criterion were changed to target different stability margins, the framework might produce controllers certified for stricter performance requirements.
Maintaining the dynamic dataset of subregions might generalize to settings where the input domain changes over time, such as online adaptation.

Load-bearing premise

That adaptively splitting hard input subregions during training produces certified bounds that stay tight and informative at test time without introducing new looseness or prohibitive extra cost.

What would settle it

Running CT-BaB on the output-feedback 2D quadrotor system and finding that verification time is not reduced by at least a factor of eleven while the certified region of attraction fails to grow by at least 164 times.

Figures

Figures reproduced from arXiv: 2411.18235 by Cho-Jui Hsieh, Haoyu Li, Huan Zhang, Zhouxing Shi.

**Figure 1.** Figure 1: Visualization of the Lyapunov function (color plots) and ROA (contours) on the 2D quadrotor system with three [PITH_FULL_IMAGE:figures/full_fig_p008_1.png] view at source ↗

**Figure 2.** Figure 2: Visualization for the distribution of subregions in [PITH_FULL_IMAGE:figures/full_fig_p015_2.png] view at source ↗

**Figure 3.** Figure 3: Visualization for the distribution of subregions in [PITH_FULL_IMAGE:figures/full_fig_p015_3.png] view at source ↗

**Figure 4.** Figure 4: Visualization for the distribution of subregions in [PITH_FULL_IMAGE:figures/full_fig_p016_4.png] view at source ↗

read the original abstract

We study the problem of learning verifiably Lyapunov-stable neural controllers that provably satisfy the Lyapunov asymptotic stability condition within a region-of-attraction (ROA). Unlike previous works that adopted counterexample-guided training without considering the computation of verification in training, we introduce Certified Training with Branch-and-Bound (CT-BaB), a new certified training framework that optimizes certified bounds, thereby reducing the discrepancy between training and test-time verification that also computes certified bounds. To achieve a relatively global guarantee on an entire input region-of-interest, we propose a training-time BaB technique that maintains a dynamic training dataset and adaptively splits hard input subregions into smaller ones, to tighten certified bounds and ease the training. Meanwhile, subregions created by the training-time BaB also inform test-time verification, for a more efficient training-aware verification. We demonstrate that CT-BaB yields verification-friendly models that can be more efficiently verified at test time while achieving stronger verifiable guarantees with larger ROA. On the largest output-feedback 2D Quadrotor system experimented, CT-BaB reduces verification time by over 11X relative to the previous state-of-the-art baseline using Counterexample Guided Inductive Synthesis (CEGIS), while achieving 164X larger ROA. Code is available at https://github.com/shizhouxing/CT-BaB.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

CT-BaB runs branch-and-bound during training to build a dynamic partition that is reused at verification time, producing the reported speedups and larger ROA on the quadrotor example.

read the letter

The new piece is the training-time BaB loop that adaptively splits difficult subregions, keeps a dynamic dataset of those splits, and feeds the same partition into test-time verification. This is meant to shrink the usual gap between what the trainer optimizes and what the verifier can certify. On the output-feedback 2D quadrotor they report 11X faster verification and 164X larger ROA than the CEGIS baseline they compare against. That gap is large enough to be worth noticing if it holds up under closer inspection of the bound rules and the splitting schedule. The abstract makes clear that the same subregions created in training are reused for verification, which is the concrete mechanism they claim reduces looseness and cost. The code release is also useful for anyone who wants to check the implementation details. The main soft spot is that the headline numbers come from one primary system; it is not yet obvious how sensitive the gains are to the choice of relaxation, the exact termination criteria for splitting, or to higher-dimensional cases. If the adaptive splits introduce extra subregions that must still be checked at test time, the net speedup could shrink. The paper does not appear to overclaim the numbers as general, but the link between training splits and final certified tightness needs the full experimental protocol to evaluate. People working on verified neural controllers for robotics will want to read this. It is grounded enough in a concrete algorithmic change and a reproducible baseline comparison to merit peer review rather than desk rejection.

Referee Report

2 major / 1 minor

Summary. The paper proposes Certified Training with Branch-and-Bound (CT-BaB) for learning neural controllers satisfying Lyapunov asymptotic stability within a region-of-attraction (ROA). Unlike prior CEGIS-based methods, CT-BaB integrates branch-and-bound into training via a dynamic dataset that adaptively splits hard input subregions to tighten certified bounds; the resulting partitions are reused to accelerate test-time verification. Experiments on control systems, including an output-feedback 2D quadrotor, report >11X verification speedup and 164X larger ROA relative to the CEGIS baseline.

Significance. If the central empirical claims hold, the work would meaningfully narrow the train-test gap in certified neural control by producing models whose certified bounds remain tight and cheap to compute when the training-time partition is reused at verification time. Code release aids reproducibility of the reported speed-ups and ROA factors.

major comments (2)

[Abstract] Abstract: the headline claim that CT-BaB yields 11X faster verification and 164X larger ROA on the output-feedback 2D quadrotor rests on the training-time adaptive splits producing bounds that remain tight when reused at test time. The abstract supplies neither the precise bound-propagation rules nor the splitting termination criteria, so it is impossible to assess whether the dynamic partition introduces new sources of looseness or inflates the number of subregions that must be verified.
[Method (training-time BaB)] The description of the training-time BaB procedure (dynamic dataset maintenance and adaptive splitting) does not contain an explicit argument or ablation showing that the particular relaxation chosen preserves tightness across the train-test transition; this link is load-bearing for the claim that the method reduces the discrepancy between training and test-time verification.

minor comments (1)

All experimental protocols (bound-computation details, data splits, termination criteria for splitting) should be stated with sufficient precision to allow independent reproduction of the reported timing and ROA numbers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments point-by-point below and indicate the revisions we will make to strengthen the manuscript.

read point-by-point responses

Referee: [Abstract] Abstract: the headline claim that CT-BaB yields 11X faster verification and 164X larger ROA on the output-feedback 2D quadrotor rests on the training-time adaptive splits producing bounds that remain tight when reused at test time. The abstract supplies neither the precise bound-propagation rules nor the splitting termination criteria, so it is impossible to assess whether the dynamic partition introduces new sources of looseness or inflates the number of subregions that must be verified.

Authors: We agree that the abstract should be more self-contained. In the revision we will expand it to state that (i) the same linear relaxations are used for bound propagation in both training and verification and (ii) splitting terminates when the certified bound on each subregion falls below a tightness threshold or a maximum depth is reached. These additions will clarify that the dynamic partition does not introduce new looseness beyond what is already present in the underlying relaxation. revision: yes
Referee: [Method (training-time BaB)] The description of the training-time BaB procedure (dynamic dataset maintenance and adaptive splitting) does not contain an explicit argument or ablation showing that the particular relaxation chosen preserves tightness across the train-test transition; this link is load-bearing for the claim that the method reduces the discrepancy between training and test-time verification.

Authors: We will add an explicit paragraph in Section 3.2 explaining that the training-time procedure re-uses exactly the same bound-propagation rules and termination criteria as the test-time verifier; the only difference is that the partition is discovered adaptively during training rather than fixed in advance. We will also include a short ablation that reports the certified ROA size when the learned partition is replaced by a uniform grid of the same cardinality, thereby quantifying the tightness benefit of the training-aware partition. revision: yes

Circularity Check

0 steps flagged

No significant circularity in CT-BaB derivation or claims

full rationale

The paper introduces an algorithmic change: incorporating adaptive BaB splitting into the training loop to produce verification-friendly models whose certified Lyapunov bounds are then reused at test time. The reported gains (11X verification speedup, 164X larger ROA) are presented as empirical outcomes of this procedure rather than quantities defined by construction from fitted parameters or prior self-citations. No self-definitional steps, fitted-input predictions, or load-bearing self-citation chains appear in the abstract or described method; the training-test alignment is an explicit design goal, not a tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The method rests on standard domain assumptions about the computability of certified bounds for neural networks and Lyapunov functions; no new free parameters or invented entities are described in the abstract.

axioms (1)

domain assumption Lyapunov asymptotic stability can be certified via computable bounds on neural-network controllers inside a region of attraction
Invoked throughout the abstract as the target property that CT-BaB optimizes.

pith-pipeline@v0.9.0 · 5786 in / 1279 out tokens · 30704 ms · 2026-05-23T16:39:11.237788+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

we propose a training-time branch-and-bound technique that maintains a dynamic training dataset and adaptively splits hard input subregions
IndisputableMonolith/Foundation/BranchSelection.lean branch_selection unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

optimize for verified bounds on subregions of inputs

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

19 extracted references · 19 canonical work pages

[1]

Cora 2016 manual

Matthias Althoff and Niklas Kochdumper. Cora 2016 manual. TU Munich, 85748,

work page 2016
[2]

Hamilton-jacobi reachability: A brief overview and recent advances

Somil Bansal, Mo Chen, Sylvia Herbert, and Claire J Tomlin. Hamilton-jacobi reachability: A brief overview and recent advances. In 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp. 2242–2253. IEEE,

work page 2017
[3]

Branch and bound for piecewise linear neural network verification

Rudy Bunel, P Mudigonda, Ilker Turkaslan, P Torr, Jingyue Lu, and Pushmeet Kohli. Branch and bound for piecewise linear neural network verification. Journal of Machine Learning Research, 21(2020),

work page 2020
[4]

Lyapunov-stable neural-network control

Hongkai Dai, Benoit Landry, Lujie Yang, Marco Pavone, and Russ Tedrake. Lyapunov-stable neural-network control. arXiv preprint arXiv:2109.14152,

work page arXiv
[5]

Ibp regularization for verified adversarial robustness via branch-and-bound

Alessandro De Palma, Rudy Bunel, Krishnamurthy Dvijotham, M Pawan Kumar, and Robert Stanforth. Ibp regularization for verified adversarial robustness via branch-and-bound. arXiv preprint arXiv:2206.14772,

work page arXiv
[6]

On the effectiveness of interval bound propagation for training verifiably robust models

Sven Gowal, Krishnamurthy Dvijotham, Robert Stanforth, Rudy Bunel, Chongli Qin, Jonathan Uesato, Timothy Mann, and Pushmeet Kohli. On the effectiveness of interval bound propagation for training verifiably robust models. arXiv preprint arXiv:1810.12715,

work page arXiv
[7]

Certified robust invariant polytope training in neural controlled odes

Akash Harapanahalli and Samuel Coogan. Certified robust invariant polytope training in neural controlled odes. arXiv preprint arXiv:2408.01273,

work page arXiv
[8]

Efficient neural network verification via adaptive refinement and adversarial search

Patrick Henriksen and Alessio Lomuscio. Efficient neural network verification via adaptive refinement and adversarial search. In ECAI 2020, pp. 2513–2520. IOS Press,

work page 2020
[9]

Reach-sdp: Reachability analysis of closed-loop systems with neural network controllers via semidefinite programming

Haimin Hu, Mahyar Fazlyab, Manfred Morari, and George J Pappas. Reach-sdp: Reachability analysis of closed-loop systems with neural network controllers via semidefinite programming. In 2020 59th IEEE conference on decision and control (CDC), pp. 5929–5934. IEEE,

work page 2020
[10]

Neural certificates for safe control policies

Wanxin Jin, Zhaoran Wang, Zhuoran Yang, and Shaoshuai Mou. Neural certificates for safe control policies. arXiv preprint arXiv:2006.08465,

work page arXiv 2006
[11]

On the paradox of certified training.arXiv preprint arXiv:2102.06700,

Nikola Jovanovi´c, Mislav Balunovi´c, Maximilian Baader, and Martin Vechev. On the paradox of certified training.arXiv preprint arXiv:2102.06700,

work page arXiv
[12]

Control design along trajectories with sums of squares programming

Anirudha Majumdar, Amir Ali Ahmadi, and Russ Tedrake. Control design along trajectories with sums of squares programming. In 2013 IEEE International Conference on Robotics and Automation, pp. 4054–4061. IEEE,

work page 2013
[13]

Wikipedia contributors

Mark Niklas Müller, Franziska Eckert, Marc Fischer, and Martin Vechev. Certified training: Small boxes are all you need. arXiv preprint arXiv:2210.04871,

work page arXiv
[14]

Neural network verification with branch-and-bound for general nonlinearities

Zhouxing Shi, Qirui Jin, Zico Kolter, Suman Jana, Cho-Jui Hsieh, and Huan Zhang. Neural network verification with branch-and-bound for general nonlinearities. arXiv preprint arXiv:2405.21063,

work page arXiv
[15]

Provably safe neural network controllers via differential dynamic logic

Samuel Teuber, Stefan Mitsch, and André Platzer. Provably safe neural network controllers via differential dynamic logic. arXiv preprint arXiv:2402.10998,

work page arXiv
[16]

Beta-crown: Efficient bound propagation with per-neuron split constraints for neural network robustness verification

Shiqi Wang, Huan Zhang, Kaidi Xu, Xue Lin, Suman Jana, Cho-Jui Hsieh, and J Zico Kolter. Beta-crown: Efficient bound propagation with per-neuron split constraints for neural network robustness verification. Advances in Neural Information Processing Systems, 34:29909–29921, 2021a. Xinyu Wang, Luzia Knoedler, Frederik Baymler Mathiesen, and Javier Alonso-Mo...

work page 2024
[17]

General cutting planes for bound-propagation-based neural network verification

12 Huan Zhang, Shiqi Wang, Kaidi Xu, Linyi Li, Bo Li, Suman Jana, Cho-Jui Hsieh, and J Zico Kolter. General cutting planes for bound-propagation-based neural network verification. arXiv preprint arXiv:2208.05740,

work page arXiv
[18]

(2024) (we follow their source code which has some minor difference with the information provided in their paper)

13 A Details of the Implementation and Experiments We directly adopt the model architecture of all the controllers and Lyapunov functions from Yang et al. (2024) (we follow their source code which has some minor difference with the information provided in their paper). The controller is always a fully-connected NN with 8 hidden neurons in each hidden laye...

work page 2024
[19]

10 5 0 5 10 (rad) 10 5 0 5 10 (rad/s) 25000 50000 75000 100000 125000 150000 175000 200000 (b) Small torque

14 10 5 0 5 10 (rad) 10 5 0 5 10 (rad/s) 250 500 750 1000 1250 1500 1750 (a) Large torque. 10 5 0 5 10 (rad) 10 5 0 5 10 (rad/s) 25000 50000 75000 100000 125000 150000 175000 200000 (b) Small torque. Figure 2: Visualization for the distribution of subregions in D at the end of the training for the inverted pendulum system, with large torque limit and smal...

work page 2000

[1] [1]

Cora 2016 manual

Matthias Althoff and Niklas Kochdumper. Cora 2016 manual. TU Munich, 85748,

work page 2016

[2] [2]

Hamilton-jacobi reachability: A brief overview and recent advances

Somil Bansal, Mo Chen, Sylvia Herbert, and Claire J Tomlin. Hamilton-jacobi reachability: A brief overview and recent advances. In 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp. 2242–2253. IEEE,

work page 2017

[3] [3]

Branch and bound for piecewise linear neural network verification

Rudy Bunel, P Mudigonda, Ilker Turkaslan, P Torr, Jingyue Lu, and Pushmeet Kohli. Branch and bound for piecewise linear neural network verification. Journal of Machine Learning Research, 21(2020),

work page 2020

[4] [4]

Lyapunov-stable neural-network control

Hongkai Dai, Benoit Landry, Lujie Yang, Marco Pavone, and Russ Tedrake. Lyapunov-stable neural-network control. arXiv preprint arXiv:2109.14152,

work page arXiv

[5] [5]

Ibp regularization for verified adversarial robustness via branch-and-bound

Alessandro De Palma, Rudy Bunel, Krishnamurthy Dvijotham, M Pawan Kumar, and Robert Stanforth. Ibp regularization for verified adversarial robustness via branch-and-bound. arXiv preprint arXiv:2206.14772,

work page arXiv

[6] [6]

On the effectiveness of interval bound propagation for training verifiably robust models

Sven Gowal, Krishnamurthy Dvijotham, Robert Stanforth, Rudy Bunel, Chongli Qin, Jonathan Uesato, Timothy Mann, and Pushmeet Kohli. On the effectiveness of interval bound propagation for training verifiably robust models. arXiv preprint arXiv:1810.12715,

work page arXiv

[7] [7]

Certified robust invariant polytope training in neural controlled odes

Akash Harapanahalli and Samuel Coogan. Certified robust invariant polytope training in neural controlled odes. arXiv preprint arXiv:2408.01273,

work page arXiv

[8] [8]

Efficient neural network verification via adaptive refinement and adversarial search

Patrick Henriksen and Alessio Lomuscio. Efficient neural network verification via adaptive refinement and adversarial search. In ECAI 2020, pp. 2513–2520. IOS Press,

work page 2020

[9] [9]

Reach-sdp: Reachability analysis of closed-loop systems with neural network controllers via semidefinite programming

Haimin Hu, Mahyar Fazlyab, Manfred Morari, and George J Pappas. Reach-sdp: Reachability analysis of closed-loop systems with neural network controllers via semidefinite programming. In 2020 59th IEEE conference on decision and control (CDC), pp. 5929–5934. IEEE,

work page 2020

[10] [10]

Neural certificates for safe control policies

Wanxin Jin, Zhaoran Wang, Zhuoran Yang, and Shaoshuai Mou. Neural certificates for safe control policies. arXiv preprint arXiv:2006.08465,

work page arXiv 2006

[11] [11]

On the paradox of certified training.arXiv preprint arXiv:2102.06700,

Nikola Jovanovi´c, Mislav Balunovi´c, Maximilian Baader, and Martin Vechev. On the paradox of certified training.arXiv preprint arXiv:2102.06700,

work page arXiv

[12] [12]

Control design along trajectories with sums of squares programming

Anirudha Majumdar, Amir Ali Ahmadi, and Russ Tedrake. Control design along trajectories with sums of squares programming. In 2013 IEEE International Conference on Robotics and Automation, pp. 4054–4061. IEEE,

work page 2013

[13] [13]

Wikipedia contributors

Mark Niklas Müller, Franziska Eckert, Marc Fischer, and Martin Vechev. Certified training: Small boxes are all you need. arXiv preprint arXiv:2210.04871,

work page arXiv

[14] [14]

Neural network verification with branch-and-bound for general nonlinearities

Zhouxing Shi, Qirui Jin, Zico Kolter, Suman Jana, Cho-Jui Hsieh, and Huan Zhang. Neural network verification with branch-and-bound for general nonlinearities. arXiv preprint arXiv:2405.21063,

work page arXiv

[15] [15]

Provably safe neural network controllers via differential dynamic logic

Samuel Teuber, Stefan Mitsch, and André Platzer. Provably safe neural network controllers via differential dynamic logic. arXiv preprint arXiv:2402.10998,

work page arXiv

[16] [16]

Beta-crown: Efficient bound propagation with per-neuron split constraints for neural network robustness verification

Shiqi Wang, Huan Zhang, Kaidi Xu, Xue Lin, Suman Jana, Cho-Jui Hsieh, and J Zico Kolter. Beta-crown: Efficient bound propagation with per-neuron split constraints for neural network robustness verification. Advances in Neural Information Processing Systems, 34:29909–29921, 2021a. Xinyu Wang, Luzia Knoedler, Frederik Baymler Mathiesen, and Javier Alonso-Mo...

work page 2024

[17] [17]

General cutting planes for bound-propagation-based neural network verification

12 Huan Zhang, Shiqi Wang, Kaidi Xu, Linyi Li, Bo Li, Suman Jana, Cho-Jui Hsieh, and J Zico Kolter. General cutting planes for bound-propagation-based neural network verification. arXiv preprint arXiv:2208.05740,

work page arXiv

[18] [18]

(2024) (we follow their source code which has some minor difference with the information provided in their paper)

13 A Details of the Implementation and Experiments We directly adopt the model architecture of all the controllers and Lyapunov functions from Yang et al. (2024) (we follow their source code which has some minor difference with the information provided in their paper). The controller is always a fully-connected NN with 8 hidden neurons in each hidden laye...

work page 2024

[19] [19]

10 5 0 5 10 (rad) 10 5 0 5 10 (rad/s) 25000 50000 75000 100000 125000 150000 175000 200000 (b) Small torque

14 10 5 0 5 10 (rad) 10 5 0 5 10 (rad/s) 250 500 750 1000 1250 1500 1750 (a) Large torque. 10 5 0 5 10 (rad) 10 5 0 5 10 (rad/s) 25000 50000 75000 100000 125000 150000 175000 200000 (b) Small torque. Figure 2: Visualization for the distribution of subregions in D at the end of the training for the inverted pendulum system, with large torque limit and smal...

work page 2000