Recognition: unknown
Efficient Uniform Feasible Set Sampling for Approximate Linear MPC
Pith reviewed 2026-05-10 16:38 UTC · model grok-4.3
The pith
Linear MPC Hit-and-Run sampler locates feasible boundaries with one linear program per direction.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For linear MPC with polyhedral constraints, the feasible set boundaries along search directions in hit-and-run sampling can be found exactly by solving one convex linear program instead of iterative line searches.
What carries the argument
LMPC-HR sampler that encodes the feasible-set boundary search as a convex linear program.
If this is right
- The cost of generating training data for approximate MPC drops substantially.
- Uniform sampling from high-dimensional polyhedral sets becomes feasible for practical controller design.
- Approximate policies can be trained on larger and more representative data sets without prohibitive compute.
Where Pith is reading between the lines
- The same single-LP boundary technique could apply to other sampling tasks in constrained optimization.
- Faster sampling may allow real-time retraining or adaptation of approximate controllers during operation.
- Integration with existing MPC toolboxes would lower barriers to using learned controllers in safety-critical applications.
Load-bearing premise
The polyhedral constraints of the linear MPC problem allow the exact location of the feasible set boundary in any direction to be recovered by solving a single convex linear program.
What would settle it
If experiments on standard linear MPC benchmark problems show that LMPC-HR takes as much or more time as conventional hit-and-run sampling, or produces non-uniform samples, the claimed efficiency gain would be falsified.
Figures
read the original abstract
Model Predictive Control (MPC) offers safe and near-optimal control but suffers from high computational costs. Approximate MPC (AMPC) mitigates this by learning a cheaper surrogate policy, typically by training a neural network on state-MPC input pairs. Generating training data is a major bottleneck, requiring solving the MPC for numerous states sampled from its feasible set. Since this feasible set is implicitly defined and unknown, efficient sampling is nontrivial but crucial. We propose the linear MPC Hit-and-Run (LMPC-HR) sampler for linear MPC with polyhedral constraints. We identify the feasible set boundaries along search directions, a crucial step within HR, by formulating the problem as a convex linear program, replacing expensive iterative searches with a single optimization step. A numerical study demonstrates that LMPC-HR achieves an order of magnitude reduction in computation time for generating uniformly distributed samples from the feasible set compared to naive baselines.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes the linear MPC Hit-and-Run (LMPC-HR) sampler for generating uniformly distributed samples from the polyhedral feasible set of linear model predictive control problems. It replaces iterative line searches in the standard Hit-and-Run algorithm with a single convex linear program to identify the feasible interval along each random direction. A numerical study is presented claiming an order-of-magnitude reduction in computation time compared to naive baselines.
Significance. If the proposed sampler maintains the uniformity properties of Hit-and-Run while achieving the reported efficiency gains, it would provide a valuable tool for efficiently generating training data for approximate MPC methods, addressing a key bottleneck in learning-based control. The approach leverages the polyhedral structure specific to linear MPC, which is a strength.
major comments (2)
- [LP-based boundary recovery (methods section)] The LP formulation for recovering the feasible interval along a direction d (max t s.t. A(x0 + t d) <= b, or equivalently (A d) t <= b - A x0) is presented as exactly recovering the chord endpoints. However, when d is parallel to a facet (A_i d = 0), the constraint reduces to 0*t <= 0. Standard LP solvers may return an arbitrary feasible t or an interior point rather than the true boundary when the row is exactly or numerically zero, violating the exact-chord requirement for the Hit-and-Run uniformity proof. No perturbation, active-set post-processing, or degeneracy check is described.
- [Numerical study] The numerical study claims an order-of-magnitude reduction in computation time but supplies no information on problem dimensions (state/input sizes, number of inequalities), number of samples drawn, concrete baseline implementations (e.g., how the naive iterative line search is coded), run-to-run variability, or any diagnostic confirming that the generated samples remain uniformly distributed (e.g., via Kolmogorov-Smirnov tests against the target measure or comparison with a known uniform sampler).
minor comments (2)
- [Abstract] The abstract refers to 'naive baselines' without naming them; the methods or results section should explicitly state the competing algorithms and their implementations.
- [Preliminaries] Notation for the polyhedral set (A, b) and the current point x0 should be introduced once and used consistently; a small table summarizing symbols would improve readability.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments on our manuscript. We address each major comment point by point below, indicating the revisions we will make to improve clarity, rigor, and reproducibility.
read point-by-point responses
-
Referee: The LP formulation for recovering the feasible interval along a direction d (max t s.t. A(x0 + t d) <= b, or equivalently (A d) t <= b - A x0) is presented as exactly recovering the chord endpoints. However, when d is parallel to a facet (A_i d = 0), the constraint reduces to 0*t <= 0. Standard LP solvers may return an arbitrary feasible t or an interior point rather than the true boundary when the row is exactly or numerically zero, violating the exact-chord requirement for the Hit-and-Run uniformity proof. No perturbation, active-set post-processing, or degeneracy check is described.
Authors: We appreciate the referee highlighting this potential degeneracy issue in the LP-based boundary recovery. In exact arithmetic, a zero row (A_i d = 0) with a feasible x0 yields a non-binding constraint that does not alter the computed maximum t, which remains determined by the other inequalities. However, we acknowledge that numerical LP solvers can encounter difficulties with exactly or near-zero coefficients, potentially returning suboptimal or arbitrary t values. To resolve this while preserving the exact-chord property required for uniformity, we will revise the methods section to incorporate a small random perturbation to d (on the order of machine epsilon) when any |A_i d| falls below a threshold, or add an active-set verification step post-LP solve to confirm and adjust boundary points if needed. These additions ensure robustness without significantly impacting the reported efficiency gains. revision: yes
-
Referee: The numerical study claims an order-of-magnitude reduction in computation time but supplies no information on problem dimensions (state/input sizes, number of inequalities), number of samples drawn, concrete baseline implementations (e.g., how the naive iterative line search is coded), run-to-run variability, or any diagnostic confirming that the generated samples remain uniformly distributed (e.g., via Kolmogorov-Smirnov tests against the target measure or comparison with a known uniform sampler).
Authors: We agree that the numerical study requires substantially more detail for reproducibility and to substantiate the uniformity and efficiency claims. We will expand this section to report the specific problem dimensions (state size, input size, and number of inequalities in the polyhedral set), the total number of samples generated per experiment, the exact implementation of the naive baseline (e.g., iterative bisection or golden-section search along each direction with a specified tolerance), mean and standard deviation of run times over multiple independent trials, and explicit uniformity diagnostics including Kolmogorov-Smirnov tests on marginal distributions against the uniform measure as well as low-dimensional projection visualizations comparing against a reference uniform sampler. These revisions will directly address the gaps and strengthen the empirical validation. revision: yes
Circularity Check
No significant circularity; algorithmic reformulation with external verifiability
full rationale
The paper proposes LMPC-HR as an algorithmic replacement of iterative line searches in Hit-and-Run sampling by a single LP per direction to recover exact chord endpoints for polyhedral feasible sets. No derivation step reduces by construction to a fitted parameter, self-definition, or self-citation chain; the uniformity guarantee is imported from the standard Hit-and-Run literature and the LP encoding is a direct transcription of the polyhedron {x | A x ≤ b}. Correctness can be checked against any external convex solver and the known Markov-chain properties of HR, satisfying the criteria for non-circularity.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption For linear MPC with polyhedral constraints the feasible set is convex and its intersection with any line can be recovered by solving a convex linear program.
Reference graph
Works this paper leans on
-
[1]
Abu-Ali, M., Berkel, F., Manderla, M., Reimann, S., Ken- nel, R., and Abdelrahem, M. (2022). Deep learning- based long-horizon MPC: Robust, high performing, and computationally efficient control for PMSM drives. IEEE Transactions on Power Electronics, 37(10), 12486–12501. Andersson, J.A.E., Gillis, J., Horn, G., Rawlings, J.B., and Diehl, M. (2019). CasAD...
-
[2]
Feasible samples are indicated by the green dots and infeasible samples by the red dots
Visual comparison of 1000 uniformly distributed states sampled fromX N via the baseline methods UVRS (Far left), DRS-HR (Mid-left), BS-HR (Mid-right), and the proposed LMPC-HR sampler (Far right). Feasible samples are indicated by the green dots and infeasible samples by the red dots. The state constraint setXis illustrated by the blue dashed box. Bempora...
-
[3]
A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
Nob Hill Publishing Madison, WI. Ross, S., Gordon, G.J., and Bagnell, J.A. (2011). A reduction of imitation learning and structured prediction to no-regret online learning. URL https://arxiv.org/abs/1011.0686. Rudolf, D. and Ullrich, M. (2018). Comparison of Hit-and- Run, slice sampler and random walk metropolis.Journal of Applied Probability, 55(4), 1186...
work page Pith review arXiv 2011
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.