pith. machine review for the scientific record.

arxiv: 2604.05798 · v1 · submitted 2026-04-07 · 📡 eess.SY · cs.SY

Recognition: no theorem link

Robust Nonlinear System Identification in Reproducing Kernel Hilbert Spaces via Scenario Optimization

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 18:32 UTC · model grok-4.3

classification 📡 eess.SY cs.SY
keywords nonlinear system identification · reproducing kernel Hilbert spaces · scenario optimization · one-step prediction tubes · finite-dimensional approximation · robust learning · violation guarantees

The pith

Finite-dimensional approximations of bounded RKHS sets let scenario optimization certify one-step predictors for nonlinear systems without prior norm bounds.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to build one-step prediction tubes for unknown nonlinear systems by first shrinking the infinite-dimensional RKHS hypothesis set down to a finite-dimensional subspace. Bounds on n-widths together with a greedy basis selection give an explicit way to choose the subspace size, and for Sobolev-equivalent kernels this size grows in a controlled way with smoothness and input dimension. Once the problem is finite-dimensional, convex scenario optimization produces a predictor together with explicit probabilistic violation guarantees. The key practical gain is that these guarantees hold without any a priori bound on the true system's RKHS norm or Lipschitz constant. The method is illustrated on an obstacle-avoidance task, and the authors flag dimensional scaling and the i.i.d. data requirement as current limits.
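As a concrete illustration of the basis-reduction step, here is a minimal sketch of P-greedy center selection, one standard greedy rule for this purpose. The Gaussian kernel, the seeding choice, and all function names are assumptions for the sketch, not the paper's exact algorithm.

```python
import numpy as np

def gaussian_kernel(X, Y, lengthscale=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))

def p_greedy_basis(X, n_basis, lengthscale=1.0, jitter=1e-9):
    """Greedily pick n_basis centers from the candidate points X by
    maximizing the kernel power function (P-greedy selection)."""
    selected = [0]  # arbitrary seed; k(x, x) = 1 everywhere for this kernel
    while len(selected) < n_basis:
        C = X[selected]
        K_cc = gaussian_kernel(C, C, lengthscale) + jitter * np.eye(len(selected))
        K_xc = gaussian_kernel(X, C, lengthscale)
        # Squared power function: k(x, x) - k_C(x)^T K_CC^{-1} k_C(x),
        # the worst-case interpolation error at x over the unit RKHS ball.
        coeffs = np.linalg.solve(K_cc, K_xc.T)
        power2 = 1.0 - np.einsum('nm,mn->n', K_xc, coeffs)
        power2[selected] = -np.inf  # never re-pick a chosen center
        selected.append(int(np.argmax(power2)))
    return selected
```

The finite-dimensional subspace is then span{k(·, x_i) : i ∈ selected}, whose dimension plays the role of n in the n-width bounds.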

Core claim

Approximating the bounded RKHS hypothesis set by a finite-dimensional subspace via n-width bounds and greedy basis reduction reduces robust nonlinear system identification to a convex scenario program. This program returns a predictor whose one-step prediction error satisfies explicit violation-probability bounds that do not depend on knowing the true system's RKHS norm or Lipschitz constant in advance.
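To make "convex scenario program" concrete, a hedged sketch: fit coefficients over the selected kernel sections by minimizing the worst-case one-step residual across the sampled scenarios, which is a linear program. The minimax objective, the feature matrix Phi, and the absence of a coefficient-norm constraint are illustrative assumptions; the paper's program may differ in these details.

```python
import numpy as np
from scipy.optimize import linprog

def scenario_predictor(Phi, y):
    """min_{c, eps} eps  subject to  |y_i - Phi[i] @ c| <= eps for every
    scenario i. Returns the coefficients c and the tube half-width eps.
    Phi: (N, n) kernel sections evaluated at the data; y: (N,) targets."""
    N, n = Phi.shape
    cost = np.zeros(n + 1)
    cost[-1] = 1.0  # decision vector z = [c, eps]; minimize eps
    A_ub = np.vstack([
        np.hstack([Phi, -np.ones((N, 1))]),   #  Phi c - eps <=  y
        np.hstack([-Phi, -np.ones((N, 1))]),  # -Phi c - eps <= -y
    ])
    b_ub = np.concatenate([y, -y])
    bounds = [(None, None)] * n + [(0, None)]  # c free, eps >= 0
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:n], res.x[-1]
```

Because this program has n + 1 decision variables, classical scenario theory attaches a violation-probability bound to the returned (c, eps) pair; that bound depends on N and n + 1 but not on the true system's RKHS norm, which is the source of the claimed relaxation.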

What carries the argument

Finite-dimensional subspace obtained from n-width estimates and greedy basis reduction of the bounded RKHS hypothesis set.

If this is right

  • The learned one-step predictor comes with explicit, computable probabilistic bounds on its prediction error.
  • No separate estimate or upper bound on the true system's RKHS norm is required to obtain the guarantees.
  • For kernels whose native spaces are norm-equivalent to Sobolev spaces, the required basis dimension can be calculated from smoothness and input dimension.
  • The same finite-dimensional reduction applies to constructing prediction tubes for control tasks such as obstacle avoidance.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • If one-step tubes can be propagated forward in time while preserving the violation probabilities, the approach would directly support finite-horizon robust planning.
  • The greedy basis selection step could be replaced by other low-rank or sparse approximations to improve scaling with input dimension.
  • Relaxing the i.i.d. assumption to mixing or ergodic data would broaden applicability to real-world time-series identification.
  • The same finite-subspace idea might transfer to other kernel-based robust learning settings outside control, such as regression with uncertainty quantification.

Load-bearing premise

The n-width and greedy approximation must faithfully capture the bounded RKHS hypothesis set of the unknown true system, and all data samples must be independent and identically distributed.

What would settle it

Collect a new batch of i.i.d. data from a system inside the original RKHS; if the empirical violation rate of the learned predictor exceeds the scenario-derived probability bound, the finite-subspace guarantee is falsified.
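One way to run this test numerically, assuming the classical scenario bound (a binomial tail in the number of decision variables d); the confidence level beta and the names in the trailing comment are placeholders.

```python
import numpy as np
from scipy.special import comb

def scenario_violation_level(N, d, beta=1e-3, tol=1e-8):
    """Smallest eps such that sum_{i<d} C(N, i) eps^i (1-eps)^(N-i) <= beta:
    the classical scenario bound for d decision variables and N scenarios."""
    def tail(eps):
        i = np.arange(d)
        return float(np.sum(comb(N, i) * eps**i * (1.0 - eps)**(N - i)))
    lo, hi = 0.0, 1.0
    while hi - lo > tol:  # tail(eps) is decreasing in eps, so bisect
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if tail(mid) > beta else (lo, mid)
    return hi

# Falsification check on a fresh i.i.d. batch (X_new, y_new), reusing the
# predictor (c, eps_tube) returned by the scenario program:
#   empirical_rate = np.mean(np.abs(y_new - Phi_new @ c) > eps_tube)
# If empirical_rate persistently exceeds scenario_violation_level(N, d, beta)
# by more than sampling noise allows, the guarantee chain is falsified.
```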

Figures

Figures reproduced from arXiv: 2604.05798 by Annika Eichler and Jannis Lübsen.

Figure 1
Figure 1: Error governed by the Kolmogorov n-widths. Definition 1 (approximation numbers & n-widths): Let $A$ and $B$ be n.v.s. such that $\mathrm{id}\colon A \to B$ is bounded, and let $A_E$ be a ball in $A$. The approximation number of $A_E$ in $B$ is defined as
$$a_n(A_E, B) \;=\; \inf_{\substack{T \in \mathcal{L}(A,B) \\ \operatorname{rank}(T) \le n}} \; \sup_{f \in A_E} \|f - Tf\|_B, \tag{6}$$
where the infimum is taken over all bounded linear operators $T$ that map from $A$ to $B$ with $\operatorname{rank}(T) \le n$. The Kolmogorov n-width… (the caption truncates here; a standard completion is given after the figure list)
Figure 2
Figure 2: Obstacle avoidance example using a robust RKHS model of the…
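The Figure 1 caption breaks off at the Kolmogorov n-width. For reference, its standard definition (textbook approximation theory, not quoted from the paper) is

```latex
d_n(A_E, B) \;=\; \inf_{\substack{V_n \subseteq B \\ \dim V_n \le n}} \; \sup_{f \in A_E} \; \inf_{g \in V_n} \|f - g\|_B ,
```

the worst-case distance of $A_E$ to the best at-most-$n$-dimensional subspace of $B$. Since the range of any rank-$\le n$ operator is such a subspace, $d_n(A_E, B) \le a_n(A_E, B)$.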
Original abstract

This paper proposes a method for constructing one-step prediction tubes for nonlinear systems using reproducing kernel Hilbert spaces. We approximate a bounded reproducing kernel Hilbert space (RKHS) hypothesis set by a finite-dimensional subspace using bounds based on n-widths and a greedy algorithm for basis reduction. For kernels whose native spaces are norm-equivalent to Sobolev spaces, we derive how the required basis size scales with kernel smoothness and input dimension. This finite-dimensional representation enables the use of convex scenario optimization to obtain violation guarantees for the learned predictor without requiring an a priori bound on the true system's RKHS norm or Lipschitz constant. The method is demonstrated on an obstacle-avoidance task. We also discuss the main limitations of the current analysis, including dimensional scaling and dependence on i.i.d. data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes approximating a bounded RKHS hypothesis set for nonlinear system identification via n-widths and greedy basis selection to obtain a finite-dimensional convex set. This enables convex scenario optimization to produce probabilistic violation guarantees for one-step predictors without an a priori bound on the true system's RKHS norm or Lipschitz constant. Scaling of the required basis size with kernel smoothness and input dimension is derived for Sobolev-equivalent kernels, with demonstration on an obstacle-avoidance task and discussion of limitations including dimensional scaling and i.i.d. data dependence.

Significance. If the central claim holds—that the n-width/greedy approximation yields a hypothesis set containing the unknown system (without knowing its RKHS norm M) on which scenario optimization transfers valid violation-probability bounds—this would be a useful contribution to robust data-driven control, relaxing common conservatism in Lipschitz or norm-bounded approaches. The explicit scaling results for basis dimension could aid practical tuning in moderate-dimensional settings.

major comments (2)
  1. [derivation of finite-dimensional representation and n-width bounds] The n-width approximation error for a function f with ||f||_H = M is bounded by M times the n-width of the unit ball. The manuscript's derivation of basis size scaling (with smoothness and input dimension) therefore still implicitly depends on M to guarantee that the true system lies within the approximated finite-dimensional set used for scenario optimization. This dependence appears to contradict the claim that no a priori bound on the RKHS norm is required for the violation guarantees to transfer; the setup of the convex hypothesis set in the scenario program must be shown to absorb or eliminate this factor. (A back-of-envelope version of this scaling is sketched after these comments.)
  2. [scenario optimization setup and guarantee transfer] Scenario optimization provides violation guarantees only when the sampled constraints are drawn i.i.d. from the true distribution and the hypothesis set is fixed independently of the data. The paper relies on this for the learned predictor, yet the greedy basis selection and n-width approximation are performed on the same data used for identification; it is unclear whether the resulting set remains data-independent in the sense required by the scenario theory cited.
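A back-of-envelope version of major comment 1, assuming the standard Kolmogorov-width rate $d_n \sim n^{-s/d}$ for the unit ball $B_H$ of a Sobolev space $H^s$ over a $d$-dimensional domain (the regime targeted by the paper's Sobolev-equivalence assumption), with $V_n^\star$ the optimal $n$-dimensional subspace:

```latex
\sup_{\|f\|_H \le M} \operatorname{dist}(f, V_n^\star)
\;=\; M \, d_n(B_H, B)
\;\sim\; M \, n^{-s/d}
\quad\Longrightarrow\quad
n \;\gtrsim\; \left(M/\varepsilon\right)^{d/s} \ \text{for a target error } \varepsilon .
```

Any fixed approximation tolerance therefore pins down the basis size only through the unknown $M$, which is exactly the dependence the comment asks the authors to absorb or eliminate.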
minor comments (2)
  1. [numerical example] The abstract states that the method is demonstrated on an obstacle-avoidance task, but the main text should include a quantitative comparison against a baseline that does use an a priori RKHS-norm bound, to illustrate the practical benefit of the claimed relaxation.
  2. [preliminaries and main result] Notation for the approximated hypothesis set (e.g., how the finite-dimensional ball radius is chosen) should be introduced earlier and used consistently when stating the scenario program.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review of our manuscript. The comments highlight important aspects of the approximation and scenario optimization setup that require clarification. We address each major comment below and indicate planned revisions to improve the presentation and rigor of the claims.

Point-by-point responses
  1. Referee: [derivation of finite-dimensional representation and n-width bounds] The n-width approximation error for a function f with ||f||_H = M is bounded by M times the n-width of the unit ball. The manuscript's derivation of basis size scaling (with smoothness and input dimension) therefore still implicitly depends on M to guarantee that the true system lies within the approximated finite-dimensional set used for scenario optimization. This dependence appears to contradict the claim that no a priori bound on the RKHS norm is required for the violation guarantees to transfer; the setup of the convex hypothesis set in the scenario program must be shown to absorb or eliminate this factor.

    Authors: We appreciate this precise observation on the scaling. The n-width error bound is indeed of the form M · d_n(·), so an a priori tolerance on approximation error would require knowledge of M to select n. However, our construction defines the finite-dimensional convex hypothesis set as the image of a sufficiently large ball in the coefficient space (with radius chosen conservatively from data norms or a fixed large constant independent of the unknown M). This set is guaranteed to contain all functions whose RKHS norm is at most that radius, and the scenario program optimizes over it without ever using the true M. The probabilistic violation bounds then hold for the returned predictor by standard scenario theory, while the n-width analysis only serves to justify that the chosen basis dimension makes the subspace expressive enough for moderate-dimensional problems. We will revise the manuscript to explicitly describe the radius selection rule, qualify the “no a priori bound” claim to refer to the scenario guarantees themselves, and add a remark on the implicit dependence for strict containment. revision: partial

  2. Referee: [scenario optimization setup and guarantee transfer] Scenario optimization provides violation guarantees only when the sampled constraints are drawn i.i.d. from the true distribution and the hypothesis set is fixed independently of the data. The paper relies on this for the learned predictor, yet the greedy basis selection and n-width approximation are performed on the same data used for identification; it is unclear whether the resulting set remains data-independent in the sense required by the scenario theory cited.

    Authors: We agree that the standard scenario results cited in the paper assume a hypothesis set chosen independently of the optimization samples. The greedy selection step does introduce data dependence. To restore the required independence we will revise the method description to recommend performing basis selection on a small, disjoint hold-out subset (or on a separate pilot dataset), after which the scenario program is solved on the remaining i.i.d. samples with a now-fixed finite-dimensional set. We will also cite recent extensions of scenario theory that accommodate limited data dependence (e.g., when the set is parameterized by a finite number of statistics) and discuss the resulting sample-complexity overhead. These changes preserve the core guarantees while addressing the referee’s concern. revision: yes
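A minimal sketch of the split the rebuttal proposes, reusing the hypothetical `p_greedy_basis`, `gaussian_kernel`, and `scenario_predictor` helpers from the sketches above; the pilot fraction is a free choice and all names are illustrative:

```python
import numpy as np

def split_then_identify(X, y, n_basis=20, pilot_frac=0.2, seed=0):
    """Select the basis on a disjoint pilot subset, then solve the scenario
    program on the remaining samples with the basis held fixed, restoring
    the data-independence that the cited scenario theory assumes."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X.shape[0])
    n_pilot = int(pilot_frac * len(idx))
    pilot, rest = idx[:n_pilot], idx[n_pilot:]
    centers = X[pilot][p_greedy_basis(X[pilot], n_basis)]  # pilot data only
    Phi = gaussian_kernel(X[rest], centers)  # features on the scenario samples
    c, eps = scenario_predictor(Phi, y[rest])
    return c, eps, centers
```

The scenario guarantees then hold conditionally on the pilot set, at the cost of the pilot samples being removed from the scenario count N.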

Circularity Check

0 steps flagged

No circularity; derivation relies on standard n-width theory and scenario optimization without self-referential reductions

Full rationale

The paper constructs a finite-dimensional approximation to the bounded RKHS hypothesis set via n-widths and greedy basis selection, derives the required dimension scaling from known properties of Sobolev-equivalent kernels, and then applies convex scenario optimization to obtain probabilistic violation guarantees. These steps invoke established results from approximation theory and robust optimization rather than defining any quantity in terms of itself or renaming fitted parameters as independent predictions. No load-bearing self-citations, uniqueness theorems imported from the authors' prior work, or ansatzes smuggled via citation appear in the abstract or described chain. The central claim that guarantees hold without an a priori RKHS-norm bound is presented as a consequence of the finite-dimensional convex set construction, not as a tautology equivalent to the inputs by construction. The analysis is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on standard RKHS and scenario-optimization theory plus two domain assumptions needed for the scaling result and the guarantee transfer.

free parameters (1)
  • Basis size N
    Chosen via n-width bounds to achieve a target approximation error; its concrete value depends on desired accuracy and is not fitted to the identification data itself.
axioms (2)
  • domain assumption The native space of the chosen kernel is norm-equivalent to a Sobolev space
    Invoked to derive how the required basis size scales with kernel smoothness and input dimension.
  • domain assumption Data samples are i.i.d.
    Required for the scenario-optimization violation probability bounds to hold.

pith-pipeline@v0.9.0 · 5429 in / 1521 out tokens · 76796 ms · 2026-05-10T18:32:39.182873+00:00 · methodology

