arxiv: 2604.09118 · v1 · submitted 2026-04-10 · 📡 eess.SY · cs.SY

Recognition: unknown

Efficient Uniform Feasible Set Sampling for Approximate Linear MPC

Elias Milios , Felix Berkel , Felix Gruber , Melanie N. Zeilinger , Kim P. Wabersich

Authors on Pith no claims yet

Pith reviewed 2026-05-10 16:38 UTC · model grok-4.3

classification 📡 eess.SY cs.SY

keywords Model Predictive ControlHit-and-RunUniform SamplingFeasible SetApproximate MPCLinear ProgrammingPolyhedral Constraints

0 comments

The pith

Linear MPC Hit-and-Run sampler locates feasible boundaries with one linear program per direction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Training neural network approximations to model predictive controllers requires many samples of states that are feasible under the MPC constraints. Because the feasible set is defined only implicitly, drawing uniform random samples from it has been expensive. The proposed LMPC-HR method solves this by turning the boundary-finding step of hit-and-run sampling into a single convex linear program. Numerical tests show this produces the required uniform samples an order of magnitude faster than previous methods.

Core claim

For linear MPC with polyhedral constraints, the feasible set boundaries along search directions in hit-and-run sampling can be found exactly by solving one convex linear program instead of iterative line searches.

What carries the argument

LMPC-HR sampler that encodes the feasible-set boundary search as a convex linear program.

If this is right

The cost of generating training data for approximate MPC drops substantially.
Uniform sampling from high-dimensional polyhedral sets becomes feasible for practical controller design.
Approximate policies can be trained on larger and more representative data sets without prohibitive compute.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same single-LP boundary technique could apply to other sampling tasks in constrained optimization.
Faster sampling may allow real-time retraining or adaptation of approximate controllers during operation.
Integration with existing MPC toolboxes would lower barriers to using learned controllers in safety-critical applications.

Load-bearing premise

The polyhedral constraints of the linear MPC problem allow the exact location of the feasible set boundary in any direction to be recovered by solving a single convex linear program.

What would settle it

If experiments on standard linear MPC benchmark problems show that LMPC-HR takes as much or more time as conventional hit-and-run sampling, or produces non-uniform samples, the claimed efficiency gain would be falsified.

Figures

Figures reproduced from arXiv: 2604.09118 by Elias Milios, Felix Berkel, Felix Gruber, Kim P. Wabersich, Melanie N. Zeilinger.

read the original abstract

Model Predictive Control (MPC) offers safe and near-optimal control but suffers from high computational costs. Approximate MPC (AMPC) mitigates this by learning a cheaper surrogate policy, typically by training a neural network on state-MPC input pairs. Generating training data is a major bottleneck, requiring solving the MPC for numerous states sampled from its feasible set. Since this feasible set is implicitly defined and unknown, efficient sampling is nontrivial but crucial. We propose the linear MPC Hit-and-Run (LMPC-HR) sampler for linear MPC with polyhedral constraints. We identify the feasible set boundaries along search directions, a crucial step within HR, by formulating the problem as a convex linear program, replacing expensive iterative searches with a single optimization step. A numerical study demonstrates that LMPC-HR achieves an order of magnitude reduction in computation time for generating uniformly distributed samples from the feasible set compared to naive baselines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

LMPC-HR recasts Hit-and-Run boundary search as one LP per direction for linear MPC polyhedra and reports an order-of-magnitude speedup, but the uniformity claim needs explicit checks on degenerate directions.

read the letter

The central contribution is a tailored Hit-and-Run variant that finds the feasible interval along each random direction by solving a single convex LP instead of iterative line searches. For polyhedral sets defined by linear MPC constraints this is a direct algorithmic shortcut that prior sampling work did not apply in this form. The numerical study shows roughly tenfold faster sample generation than naive baselines, which matters for the data-generation step in approximate MPC. That part is useful and cleanly motivated. The soft spots are around numerical robustness and experimental detail. When a direction lies parallel to a facet the LP can produce a zero row or near-zero coefficient, and a generic solver may return an interior point rather than the true boundary; the abstract gives no indication of perturbation, active-set post-processing, or degeneracy handling, so the exact-chord assumption required for the standard uniformity proof is not obviously preserved. The reported speedup also lacks information on state dimension, number of samples, baseline implementations, and run-to-run variability, making it hard to judge how general the gain is. This paper is aimed at people who generate training data for learned MPC policies or who need uniform samples from polyhedral feasible sets. A reader working on sampling algorithms or resource-limited approximate control would find the LP reformulation worth looking at. It deserves peer review because the core idea is straightforward to verify against convex solvers and the potential practical payoff is clear, even though the degeneracy and reproducibility questions need to be settled in revision.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes the linear MPC Hit-and-Run (LMPC-HR) sampler for generating uniformly distributed samples from the polyhedral feasible set of linear model predictive control problems. It replaces iterative line searches in the standard Hit-and-Run algorithm with a single convex linear program to identify the feasible interval along each random direction. A numerical study is presented claiming an order-of-magnitude reduction in computation time compared to naive baselines.

Significance. If the proposed sampler maintains the uniformity properties of Hit-and-Run while achieving the reported efficiency gains, it would provide a valuable tool for efficiently generating training data for approximate MPC methods, addressing a key bottleneck in learning-based control. The approach leverages the polyhedral structure specific to linear MPC, which is a strength.

major comments (2)

[LP-based boundary recovery (methods section)] The LP formulation for recovering the feasible interval along a direction d (max t s.t. A(x0 + t d) <= b, or equivalently (A d) t <= b - A x0) is presented as exactly recovering the chord endpoints. However, when d is parallel to a facet (A_i d = 0), the constraint reduces to 0*t <= 0. Standard LP solvers may return an arbitrary feasible t or an interior point rather than the true boundary when the row is exactly or numerically zero, violating the exact-chord requirement for the Hit-and-Run uniformity proof. No perturbation, active-set post-processing, or degeneracy check is described.
[Numerical study] The numerical study claims an order-of-magnitude reduction in computation time but supplies no information on problem dimensions (state/input sizes, number of inequalities), number of samples drawn, concrete baseline implementations (e.g., how the naive iterative line search is coded), run-to-run variability, or any diagnostic confirming that the generated samples remain uniformly distributed (e.g., via Kolmogorov-Smirnov tests against the target measure or comparison with a known uniform sampler).

minor comments (2)

[Abstract] The abstract refers to 'naive baselines' without naming them; the methods or results section should explicitly state the competing algorithms and their implementations.
[Preliminaries] Notation for the polyhedral set (A, b) and the current point x0 should be introduced once and used consistently; a small table summarizing symbols would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments on our manuscript. We address each major comment point by point below, indicating the revisions we will make to improve clarity, rigor, and reproducibility.

read point-by-point responses

Referee: The LP formulation for recovering the feasible interval along a direction d (max t s.t. A(x0 + t d) <= b, or equivalently (A d) t <= b - A x0) is presented as exactly recovering the chord endpoints. However, when d is parallel to a facet (A_i d = 0), the constraint reduces to 0*t <= 0. Standard LP solvers may return an arbitrary feasible t or an interior point rather than the true boundary when the row is exactly or numerically zero, violating the exact-chord requirement for the Hit-and-Run uniformity proof. No perturbation, active-set post-processing, or degeneracy check is described.

Authors: We appreciate the referee highlighting this potential degeneracy issue in the LP-based boundary recovery. In exact arithmetic, a zero row (A_i d = 0) with a feasible x0 yields a non-binding constraint that does not alter the computed maximum t, which remains determined by the other inequalities. However, we acknowledge that numerical LP solvers can encounter difficulties with exactly or near-zero coefficients, potentially returning suboptimal or arbitrary t values. To resolve this while preserving the exact-chord property required for uniformity, we will revise the methods section to incorporate a small random perturbation to d (on the order of machine epsilon) when any |A_i d| falls below a threshold, or add an active-set verification step post-LP solve to confirm and adjust boundary points if needed. These additions ensure robustness without significantly impacting the reported efficiency gains. revision: yes
Referee: The numerical study claims an order-of-magnitude reduction in computation time but supplies no information on problem dimensions (state/input sizes, number of inequalities), number of samples drawn, concrete baseline implementations (e.g., how the naive iterative line search is coded), run-to-run variability, or any diagnostic confirming that the generated samples remain uniformly distributed (e.g., via Kolmogorov-Smirnov tests against the target measure or comparison with a known uniform sampler).

Authors: We agree that the numerical study requires substantially more detail for reproducibility and to substantiate the uniformity and efficiency claims. We will expand this section to report the specific problem dimensions (state size, input size, and number of inequalities in the polyhedral set), the total number of samples generated per experiment, the exact implementation of the naive baseline (e.g., iterative bisection or golden-section search along each direction with a specified tolerance), mean and standard deviation of run times over multiple independent trials, and explicit uniformity diagnostics including Kolmogorov-Smirnov tests on marginal distributions against the uniform measure as well as low-dimensional projection visualizations comparing against a reference uniform sampler. These revisions will directly address the gaps and strengthen the empirical validation. revision: yes

Circularity Check

0 steps flagged

No significant circularity; algorithmic reformulation with external verifiability

full rationale

The paper proposes LMPC-HR as an algorithmic replacement of iterative line searches in Hit-and-Run sampling by a single LP per direction to recover exact chord endpoints for polyhedral feasible sets. No derivation step reduces by construction to a fitted parameter, self-definition, or self-citation chain; the uniformity guarantee is imported from the standard Hit-and-Run literature and the LP encoding is a direct transcription of the polyhedron {x | A x ≤ b}. Correctness can be checked against any external convex solver and the known Markov-chain properties of HR, satisfying the criteria for non-circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The contribution rests on standard convex optimization and the classical Hit-and-Run framework; no new free parameters, invented entities, or non-standard axioms are introduced in the abstract.

axioms (1)

domain assumption For linear MPC with polyhedral constraints the feasible set is convex and its intersection with any line can be recovered by solving a convex linear program.
Invoked when the paper replaces iterative searches with a single optimization step.

pith-pipeline@v0.9.0 · 5464 in / 1261 out tokens · 27571 ms · 2026-05-10T16:38:59.045530+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

3 extracted references · 3 canonical work pages

[1]

Abu-Ali, M., Berkel, F., Manderla, M., Reimann, S., Ken- nel, R., and Abdelrahem, M. (2022). Deep learning- based long-horizon MPC: Robust, high performing, and computationally efficient control for PMSM drives. IEEE Transactions on Power Electronics, 37(10), 12486–12501. Andersson, J.A.E., Gillis, J., Horn, G., Rawlings, J.B., and Diehl, M. (2019). CasAD...

work page arXiv 2022
[2]

Feasible samples are indicated by the green dots and infeasible samples by the red dots

Visual comparison of 1000 uniformly distributed states sampled fromX N via the baseline methods UVRS (Far left), DRS-HR (Mid-left), BS-HR (Mid-right), and the proposed LMPC-HR sampler (Far right). Feasible samples are indicated by the green dots and infeasible samples by the red dots. The state constraint setXis illustrated by the blue dashed box. Bempora...

work page arXiv 2002
[3]

A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning

Nob Hill Publishing Madison, WI. Ross, S., Gordon, G.J., and Bagnell, J.A. (2011). A reduction of imitation learning and structured prediction to no-regret online learning. URL https://arxiv.org/abs/1011.0686. Rudolf, D. and Ullrich, M. (2018). Comparison of Hit-and- Run, slice sampler and random walk metropolis.Journal of Applied Probability, 55(4), 1186...

work page Pith review arXiv 2011