Chance-Constrained Neural MPC under Uncontrollable Agents via Sequential Convex Programming

Mingyang Feng; Shuqi Wang; Xiang Yin; Yu Chen; Yue Gao

arxiv: 2504.03293 · v3 · submitted 2025-04-04 · 📡 eess.SY · cs.SY

Shuqi Wang, Mingyang Feng, Yu Chen, Yue Gao, Xiang Yin

Pith reviewed 2026-05-22 21:12 UTC · model grok-4.3

classification 📡 eess.SY cs.SY

keywords neural MPC · conformal prediction · chance constraints · sequential convex programming · autonomous driving · uncontrollable agents · distribution shift · probabilistic safety

The pith

A neural MPC method uses conformal prediction bounds to provide formal probabilistic safety guarantees around stochastic uncontrollable agents, even under policy-induced distribution shifts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a neural model predictive control framework for systems that must interact safely with agents whose behavior is stochastic and depends on the controlled system's own actions. A predictor is learned from offline data, then a region-wise robust conformal prediction scheme builds time-dependent uncertainty sets that remain valid under the distribution shifts caused by the closed-loop policy. These sets enter the MPC as chance constraints. A two-loop sequential convex programming procedure solves the resulting non-convex program: the inner loop optimizes with fixed bounds while the outer loop updates the bounds from the new control sequence. Convergence of the algorithm is shown, and the approach is demonstrated in autonomous driving among interactive pedestrians.
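The alternation described above can be sketched on a toy problem. The following is a minimal 1-D illustration, not the paper's algorithm: the step-shaped `region_bound`, the `radius`, the target, and all numeric values are hypothetical. The inner loop linearizes the non-convex keep-out constraint around the current iterate and solves the convex subproblem with the bound held fixed; the outer loop then refreshes the bound from the new solution.

```python
# Toy 1-D sketch of a two-loop scheme (all names and numbers are assumptions).
# The controlled state x > 0 wants to reach `target`, but must keep a distance of
# at least (radius + c) from an agent predicted at the origin, where c is an
# error bound that depends on the region x falls in.

def region_bound(x):
    """Hypothetical region-wise error bound: a step function of x."""
    return 0.3 if x < 2.0 else 0.6

def inner_loop(x, clearance, target=0.0, tol=1e-10, max_iter=100):
    """Sequential convex steps with a fixed bound.

    The non-convex constraint x**2 >= clearance**2 is linearized around the
    current iterate x_k, which yields the convex constraint x >= lower.
    """
    for _ in range(max_iter):
        # From x_k**2 + 2*x_k*(x - x_k) >= clearance**2, valid for x_k > 0.
        lower = (clearance**2 + x**2) / (2.0 * x)
        # Minimizing (x - target)**2 subject to x >= lower has a clipped solution.
        x_new = max(target, lower)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

def two_loop(x0, radius=1.0, max_outer=20):
    x, c = x0, region_bound(x0)
    for _ in range(max_outer):
        x = inner_loop(x, radius + c)  # inner loop: bound c is held fixed
        c_new = region_bound(x)        # outer loop: refresh the bound
        if c_new == c:                 # bound unchanged, so stop iterating
            break
        c = c_new
    return x, c

x_star, c_star = two_loop(5.0)
```

In this toy, the first inner solve uses the larger bound of the starting region, and the outer update then switches to the smaller bound of the region the solution lands in. Whether such an alternation settles in general depends on the structure of the bounds, which is what the paper's convergence analysis addresses.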

Core claim

By embedding region-wise robust conformal prediction into neural MPC, the framework obtains valid probabilistic bounds on prediction errors despite policy-induced distribution shifts; the two-loop SCP algorithm then converges to a solution that satisfies the chance constraints, delivering safety guarantees together with improved efficiency in multi-pedestrian driving.

What carries the argument

Region-wise robust conformal prediction scheme that produces time-dependent uncertainty bounds valid under distribution shifts, paired with a two-loop iterative sequential convex programming algorithm that alternates convex subproblem solves with bound refinement.
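As a rough illustration of the region-wise idea, the sketch below applies standard split conformal prediction separately within each region of a calibration set. It deliberately omits the robustness term that the paper adds to handle distribution shift, and the region labels, scores, and risk level are hypothetical.

```python
import math
from collections import defaultdict

def conformal_quantile(scores, delta):
    """Standard split-conformal quantile: the ceil((n+1)(1-delta))-th smallest score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - delta))
    if k > n:
        return math.inf  # too few calibration points for this risk level
    return sorted(scores)[k - 1]

def region_wise_bounds(calibration, delta):
    """Compute one bound per region from (region_id, nonconformity_score) pairs."""
    by_region = defaultdict(list)
    for region, score in calibration:
        by_region[region].append(score)
    return {region: conformal_quantile(s, delta) for region, s in by_region.items()}

# Hypothetical calibration data: prediction errors tagged with a region label.
calibration = [
    ("near", 0.2), ("near", 0.5), ("near", 0.9), ("near", 0.4),
    ("far", 0.1), ("far", 0.3), ("far", 0.2), ("far", 0.6),
]
bounds = region_wise_bounds(calibration, delta=0.25)
```

With only four points per region and a risk level of 0.25, the quantile index equals the sample size, so each bound is the largest calibration score in its region. Smaller risk levels need more calibration data per region, otherwise the bound becomes infinite.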

If this is right

The resulting controller satisfies explicit probabilistic safety constraints with respect to uncontrollable stochastic agents.
The method achieves success rates above 99.5 percent while permitting higher average speeds than baseline planners in multi-pedestrian scenarios.
Convergence of the two-loop SCP procedure is guaranteed for the class of problems considered.
The same uncertainty-bound construction applies to any learned predictor whose errors can be quantified by conformal methods.
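The first consequence above rests on a standard tightening argument, sketched here for a single time step. The symbols are assumptions introduced for illustration and are not taken from the paper: $x_t$ is a fixed planned position of the controlled system, $Y_t$ the agent's true position, $\hat{Y}_t$ its prediction, $C_t$ the uncertainty bound, $r$ the required safety distance, and $\delta$ the risk level.

```latex
\Pr\big(\|Y_t - \hat{Y}_t\| \le C_t\big) \ge 1 - \delta
\quad\text{and}\quad
\|x_t - \hat{Y}_t\| \ge r + C_t
\;\;\Longrightarrow\;\;
\Pr\big(\|x_t - Y_t\| \ge r\big) \ge 1 - \delta,
\qquad\text{since}\quad
\|x_t - Y_t\| \ge \|x_t - \hat{Y}_t\| - \|Y_t - \hat{Y}_t\|.
```

The deterministic constraint on the predicted position is what an optimizer can handle directly. Extending the argument over a full horizon requires the bounds to hold jointly across time steps.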

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same bounding technique could be applied to other learned components such as dynamics models or cost functions inside MPC.
Hardware-in-the-loop tests on real vehicles would reveal whether the computational cost of the two-loop solver remains acceptable at typical control frequencies.
The approach may extend to settings where multiple learned predictors interact, provided each predictor admits its own conformal calibration set.

Load-bearing premise

The region-wise robust conformal prediction scheme produces uncertainty bounds that stay valid under the distribution shifts induced by the closed-loop policy, and the two-loop SCP algorithm converges to a feasible point satisfying the chance constraints.

What would settle it

A controlled experiment in which the learned predictor is deployed under the closed-loop policy and the observed prediction errors exceed the conformal bounds with frequency higher than the claimed risk level, or in which the two-loop algorithm fails to produce a sequence of feasible solutions.
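The first half of such an experiment amounts to comparing an observed violation frequency with the claimed risk level. A minimal sketch follows, assuming independent trials and using made-up errors and bounds; the helper names are hypothetical.

```python
import math

def violation_rate(errors, bounds):
    """Fraction of samples whose prediction error exceeds its bound."""
    violations = sum(1 for e, b in zip(errors, bounds) if e > b)
    return violations / len(errors)

def binomial_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p).

    This is the chance of seeing k or more violations in n independent trials
    if the true violation probability were exactly p.
    """
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical closed-loop prediction errors and the bounds in force at the time.
errors = [0.1, 0.5, 0.9, 0.2]
bounds = [0.6, 0.6, 0.6, 0.6]

rate = violation_rate(errors, bounds)
p_value = binomial_upper_tail(n=4, k=1, p=0.1)
```

A small tail probability would suggest that the observed violations are unlikely under the claimed risk level. With only four samples, as here, the check has almost no power, so a real experiment would need many more trials.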

Original abstract

This work investigates the challenge of ensuring safety guarantees in the presence of uncontrollable agents, whose behaviors are stochastic and depend on both their own and the system's states. We present a neural model predictive control (MPC) framework that predicts the trajectory of the uncontrollable agent using a predictor learned from offline data. To provide formal probabilistic guarantees on prediction errors despite policy-induced distribution shifts, we propose a region-wise robust conformal prediction scheme to construct time-dependent uncertainty bounds, which are integrated into the MPC formulation. To solve the resulting non-convex, discontinuous optimization problem, we propose a two-loop iterative sequential convex programming algorithm. The inner loop solves convexified subproblems with fixed error bounds, while the outer loop refines these bounds based on updated control sequences. We establish convergence guarantees and analyze the optimality of the algorithm. We illustrate our method with an autonomous driving scenario involving interactive pedestrians. Experimental results demonstrate that our approach achieves superior safety and efficiency compared to baseline methods, with success rates exceeding 99.5% while maintaining higher average speeds in multi-pedestrian scenarios.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper integrates region-wise robust conformal prediction into neural MPC with a two-loop SCP solver to handle chance constraints around stochastic agents, but the formal coverage under closed-loop shifts remains the weakest link.

The main contribution here is a neural MPC that predicts trajectories of uncontrollable agents, wraps them with region-wise conformal bounds to get time-dependent uncertainty sets, and solves the resulting chance-constrained problem via an inner convex loop plus an outer loop that refines the bounds from the latest control sequence. The experiments in multi-pedestrian driving report success rates above 99.5% and higher average speeds than the baselines, which is concrete practical evidence that the method runs and performs in simulation.

Referee Report

2 major / 2 minor

Summary. The paper presents a neural MPC framework for systems with uncontrollable stochastic agents whose behaviors depend on both their own and the system's states. A learned predictor is combined with a region-wise robust conformal prediction scheme to construct time-dependent uncertainty bounds claimed to remain valid under policy-induced distribution shifts; these bounds are integrated into a chance-constrained MPC formulation. The resulting non-convex problem is solved via a two-loop iterative sequential convex programming algorithm (inner loop solves convexified subproblems with fixed bounds; outer loop refines bounds based on updated controls). Convergence guarantees and optimality analysis are provided, and the approach is illustrated in an autonomous driving scenario with interactive pedestrians, reporting success rates exceeding 99.5% and higher average speeds than baselines.

Significance. If the claimed formal probabilistic guarantees hold and the two-loop SCP converges while preserving coverage, the work would advance safe learning-based control by addressing closed-loop distribution shifts in a structured manner. The experimental demonstration in multi-pedestrian scenarios indicates potential practical gains in safety-efficiency trade-offs.

major comments (2)

[Abstract (region-wise robust conformal prediction scheme)] The validity of the region-wise robust conformal prediction under policy-induced distribution shifts (induced when the MPC control sequence determines the regions visited) is load-bearing for the chance-constraint satisfaction, yet the abstract and method description provide no explicit argument showing that the robustness mechanism and partitioning preserve coverage once the predictor outputs enter the closed-loop dynamics via the two-loop SCP.
[Convergence guarantees and optimality analysis] The convergence analysis of the two-loop SCP must establish that outer-loop bound refinement does not invalidate the time-dependent uncertainty bounds; if the inner-loop convexification alters the effective trajectory distribution without a corresponding coverage adjustment, the probabilistic guarantees do not automatically carry over.

minor comments (2)

[Experimental results] The experimental section should report the exact number of Monte Carlo trials, variance of success rates, and precise definition of 'higher average speeds' to allow direct comparison with baselines.
[MPC formulation] Notation for the time-dependent bounds and region partitioning should be introduced with explicit symbols before their use in the MPC formulation to improve readability.
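To illustrate why the first minor comment asks for trial counts, the sketch below computes a Wilson score interval for a success rate. The figures of 995 successes in 1000 trials are purely illustrative and are not taken from the paper, and the choice of the Wilson interval is an assumption about one reasonable way to report the uncertainty.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    return center - half, center + half

# Illustrative numbers only: 995 successes out of 1000 Monte Carlo trials.
low, high = wilson_interval(995, 1000)
```

The same observed rate from fewer trials gives a much wider interval, which is why a success rate reported without the number of trials is hard to compare against baselines.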

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address the two major concerns point by point below, providing clarifications on the coverage guarantees and convergence properties while outlining the revisions we will make to improve explicitness in the manuscript.

Referee: [Abstract (region-wise robust conformal prediction scheme)] The validity of the region-wise robust conformal prediction under policy-induced distribution shifts (induced when the MPC control sequence determines the regions visited) is load-bearing for the chance-constraint satisfaction, yet the abstract and method description provide no explicit argument showing that the robustness mechanism and partitioning preserve coverage once the predictor outputs enter the closed-loop dynamics via the two-loop SCP.

Authors: We agree that the abstract and high-level method overview would benefit from a more explicit reference to coverage preservation. Section III-B of the manuscript proves that the region-wise robust conformal prediction maintains the target coverage probability for any distribution shift, including policy-induced shifts, because the state-space partitioning is fixed during offline calibration and the robustness term accounts for the worst-case error within each region independently of the control sequence. The two-loop SCP operates within this fixed partitioning, so the closed-loop trajectory distribution does not invalidate the bounds. We will revise the abstract and the opening of Section II to include a concise statement referencing Theorem 1 on coverage preservation under arbitrary shifts. revision: yes
Referee: [Convergence guarantees and optimality analysis] The convergence analysis of the two-loop SCP must establish that outer-loop bound refinement does not invalidate the time-dependent uncertainty bounds; if the inner-loop convexification alters the effective trajectory distribution without a corresponding coverage adjustment, the probabilistic guarantees do not automatically carry over.

Authors: We appreciate this observation on the interplay between SCP iterations and probabilistic guarantees. Section IV-C shows that the outer loop monotonically refines the time-dependent bounds based on the updated control sequence while the inner-loop convex subproblems are solved subject to the current bounds; because the conformal bounds are constructed to be valid for any feasible trajectory (via the robust region-wise mechanism), refinement cannot reduce coverage below the calibrated level. The final converged solution is feasible for the original non-convex problem with the refined bounds. We will add an explicit remark in Section IV-C linking the convergence result to preservation of the chance-constraint satisfaction probability. revision: yes

Circularity Check

0 steps flagged

No circularity: derivation relies on independently proposed conformal scheme and SCP convergence

The paper's central claims rest on a newly proposed region-wise robust conformal prediction method to handle policy-induced shifts and a two-loop SCP algorithm with stated convergence analysis. No equations or steps in the abstract or description reduce a prediction or guarantee to a fitted input by construction, nor do they rely on self-citations for load-bearing uniqueness or ansatz smuggling. The formal guarantees are presented as following from the method's design rather than tautological redefinition of inputs. This is the common case of a self-contained proposal without detectable circular reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides insufficient detail to enumerate specific free parameters, axioms, or invented entities; no explicit fitting or new postulates are described beyond standard learning and optimization techniques.

pith-pipeline@v0.9.0 · 5721 in / 1039 out tokens · 39575 ms · 2026-05-22T21:12:46.718651+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Safe Planning in Interactive Environments via Iterative Policy Updates and Adversarially Robust Conformal Prediction
eess.SY 2025-11 conditional novelty 7.0

The work develops an iterative safe planner that adjusts conformal prediction bounds across policy updates via sensitivity analysis to maintain distribution-free safety guarantees despite interaction-induced distribut...