Mamba Sequence Modeling meets Model Predictive Control
Pith reviewed 2026-05-10 12:28 UTC · model grok-4.3
The pith
Mamba neural networks can serve as accurate multi-step predictors for model predictive control, enabling stable reference tracking in SISO and MIMO systems.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By adjusting the Mamba network to a decoder-only form and deriving an input-output formulation for dynamical systems, Mamba-MPC delivers multi-step predictions of unknown dynamics that suffice for stabilizing and tracking references in both simulated SISO and MIMO plants and on real hardware, with consistent advantages over LSTM-MPC in accuracy and speed.
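For context, the machinery being adapted is the selective state-space layer of Gu and Dao [11]: a discretized linear recurrence whose parameters are made input-dependent. A schematic form of the layer, in our notation rather than the paper's:

\[ h_k = \bar{A}_k h_{k-1} + \bar{B}_k x_k, \qquad y_k = C_k h_k, \qquad \bar{A}_k = \exp(\Delta_k A), \]

where the step size \(\Delta_k\) and the projections \(B_k, C_k\) are computed from the input \(x_k\) (the "selective" mechanism) and \(A\) is a learned state matrix. Because the recurrence remains linear in \(h_k\), it admits parallel-scan implementations with sub-quadratic cost in sequence length, which is the scaling advantage the core claim leans on.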
What carries the argument
The decoder-only Mamba multi-step predictor, which models sequence-to-sequence dynamics of the system to replace the traditional model in the MPC optimization.
Load-bearing premise
The trained Mamba predictor must generate sufficiently accurate forecasts of future states or outputs so that the MPC optimization produces a stabilizing feedback law.
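Schematically, the premise is that the receding-horizon problem solved at each time step \(k\),

\[ \min_{u_k,\dots,u_{k+N-1}} \; \sum_{i=1}^{N} \lVert \hat{y}_{k+i} - r_{k+i} \rVert_Q^2 + \sum_{i=0}^{N-1} \lVert \Delta u_{k+i} \rVert_R^2 \quad \text{s.t.} \quad (\hat{y}_{k+1},\dots,\hat{y}_{k+N}) = f_\theta\big(y_{k-L:k},\, u_{k-L:k-1},\, u_{k:k+N-1}\big), \]

with \(f_\theta\) the trained Mamba predictor conditioned on a window of \(L\) past inputs and outputs, yields a stabilizing input sequence. The paper's exact cost and constraints may differ; the point is that any such scheme stands or falls on \(\hat{y}\) tracking the true output along the trajectories the optimizer explores.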
What would settle it
Running the physical experiment on the Quanser Aero2 where the closed-loop system with Mamba-MPC becomes unstable or fails to track the reference while the LSTM version succeeds would disprove the claim.
Original abstract
In this paper, we consider the design of Model Predictive Control (MPC) algorithms based on Mamba neural networks. Mamba is a neural network architecture capable of sub-quadratic computational scaling in sequence length with state-of-the-art modeling capabilities. We provide a consistent and complete mathematical description of the Mamba neural network is provided. Then, adjustments and optimizations are made to construct a decoder-only Mamba multi-step predictor for MPC and an input-output formulation is given for sequence-to-sequence modeling of dynamical systems. The performance of Mamba-MPC is evaluated on several numerical examples and compared to a Long-Short-Term-Memory based MPC (LSTM-MPC) equivalent. First, a Single-Input-Single-Output (SISO) Van der Pol oscillator is considered, where stability, reference tracking, and noise robustness are evaluated. Then, a MIMO Four Tank setup is introduced where Multiple-Input-Multiple-Output (MIMO) reference tracking is evaluated. Lastly, Mamba-MPC is implemented on a physical Quanser Aero2 setup for closed-loop reference tracking. The results demonstrate that Mamba-MPC is able to stabilize and track a reference for SISO and MIMO systems, both in simulation and on a physical setup. Moreover, Mamba-MPC consistently outperforms LSTM-MPC in predictive control and is significantly computationally faster.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Mamba-based Model Predictive Control by giving a mathematical description of the Mamba architecture, adapting it into a decoder-only form for multi-step ahead prediction of unknown dynamics via an input-output sequence-to-sequence model, and testing the resulting Mamba-MPC controller on a SISO Van der Pol oscillator (stability, tracking, noise robustness), a MIMO four-tank system (reference tracking), and a physical Quanser Aero2 setup (closed-loop tracking). It claims that Mamba-MPC achieves the desired closed-loop behavior, consistently outperforms an equivalent LSTM-MPC baseline, and runs significantly faster.
Significance. If the empirical results can be supported by quantitative open-loop prediction metrics and reproducibility details, the work would be significant for introducing an efficient sequence-modeling alternative to LSTM in data-driven MPC, exploiting Mamba's sub-quadratic scaling for potentially longer horizons or real-time hardware use. The hardware experiment is a positive practical element.
major comments (2)
- [Abstract and Evaluation sections] The claims of stabilization, reference tracking, and outperformance versus LSTM-MPC rest entirely on closed-loop experiments, yet no open-loop multi-step prediction metrics (horizon-dependent MSE or similar on held-out trajectories) are reported for the trained Mamba model. Without these numbers the observed closed-loop success cannot be confidently attributed to superior dynamics modeling rather than cost tuning or plant simplicity.
- [Numerical examples] For all three examples (Van der Pol, Four Tank, Quanser Aero2), no training details, dataset generation procedure, hyperparameter values, or ablation studies are supplied for the Mamba or LSTM models. This prevents assessment of whether the reported superiority and speed advantage arise from the architecture itself.
minor comments (1)
- [Abstract] The abstract contains a grammatical error: 'We provide a consistent and complete mathematical description of the Mamba neural network is provided.'
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive comments. We appreciate the positive note on the hardware experiment and the potential significance of the work. We address each major comment below and commit to revisions that directly respond to the concerns raised.
Point-by-point responses
-
Referee: [Abstract and Evaluation sections] The claims of stabilization, reference tracking, and outperformance versus LSTM-MPC rest entirely on closed-loop experiments, yet no open-loop multi-step prediction metrics (horizon-dependent MSE or similar on held-out trajectories) are reported for the trained Mamba model. Without these numbers the observed closed-loop success cannot be confidently attributed to superior dynamics modeling rather than cost tuning or plant simplicity.
Authors: We agree that open-loop multi-step prediction metrics would strengthen the attribution of closed-loop performance to the quality of the learned model. The manuscript currently emphasizes closed-loop results because that is the ultimate objective of MPC, but we recognize the value of isolating modeling accuracy. In the revised version we will add a dedicated subsection reporting horizon-dependent MSE (and similar metrics) on held-out trajectories for both Mamba and LSTM predictors. These results will be presented alongside the existing closed-loop experiments to allow readers to assess whether performance differences arise from superior dynamics modeling. revision: yes
-
Referee: [Numerical examples] For all three examples (Van der Pol, Four Tank, Quanser Aero2), no training details, dataset generation procedure, hyperparameter values, or ablation studies are supplied for the Mamba or LSTM models. This prevents assessment of whether the reported superiority and speed advantage arise from the architecture itself.
Authors: We acknowledge the omission of these details in the original submission. The revised manuscript will include a new section (or expanded appendix) that provides: (i) the exact dataset generation procedure for each example, including excitation signals, number and length of trajectories, and train/validation/test splits; (ii) all hyperparameter values and architectural choices for both the Mamba and LSTM models; (iii) training settings (optimizer, learning rate schedule, epochs, loss, hardware); and (iv) ablation studies on key Mamba parameters (e.g., state dimension, number of layers) and their effect on prediction accuracy and inference speed. We will also release code and datasets to support reproducibility. These additions will make clear that the observed advantages are due to Mamba's architecture and scaling properties. revision: yes
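Both commitments reduce to reportable quantities. A minimal sketch of the horizon-dependent MSE metric promised in the first response, evaluated on held-out trajectories; the predictor interface assumed here (a callable taking past I/O and future inputs and returning an (N, ny) array) is illustrative, not the paper's API:

```python
import numpy as np

def horizon_mse(predictor, trajectories, context_len, horizon):
    """Mean squared prediction error at each step 1..horizon, averaged
    over all valid windows of a held-out trajectory set.

    predictor: callable (past_io, future_u) -> (horizon, ny) array
               (assumed interface, for illustration only)
    trajectories: list of (u, y) pairs; u: (T, nu), y: (T, ny)
    """
    sq_err = np.zeros(horizon)
    count = 0
    for u, y in trajectories:
        T = len(y)
        for k in range(context_len, T - horizon + 1):
            past_io = np.hstack([u[k - context_len:k], y[k - context_len:k]])
            y_hat = predictor(past_io, u[k:k + horizon])
            # accumulate the squared error separately per prediction step
            sq_err += ((y_hat - y[k:k + horizon]) ** 2).sum(axis=1)
            count += 1
    return sq_err / count  # MSE as a function of prediction step
```

Reporting this curve for both the Mamba and LSTM predictors on the same held-out set would directly separate modeling accuracy from controller tuning, which is the attribution gap the referee raises.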
Circularity Check
No circularity: empirical performance claims rest on direct experiments, not self-referential derivations
Full rationale
The paper supplies a mathematical description of the Mamba architecture, decoder-only adjustments, and an input-output sequence-to-sequence formulation for dynamical systems, then reports closed-loop results on Van der Pol, Four Tank, and Quanser Aero2 plants. These results are obtained by training the network and running MPC experiments; no equation or claim reduces by construction to a fitted parameter, self-citation, or ansatz imported from the authors' prior work. The central assertions (stabilization, reference tracking, outperformance of LSTM-MPC) are falsifiable empirical statements rather than tautological predictions. This matches the default expectation that most papers contain no circularity.
Axiom & Free-Parameter Ledger
free parameters (1)
- Mamba network weights
axioms (1)
- domain assumption: The Mamba architecture can be modified into a decoder-only multi-step predictor while retaining its computational scaling advantages for dynamical system modeling.
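To make this assumption concrete, one way a decoder-only multi-step rollout can be organized: condition an internal state on measured input-output history, then predict forward by feeding back the model's own output estimates. The sketch below uses a toy linear state-space model as a stand-in for the trained network; class and method names are illustrative, not the paper's code:

```python
import numpy as np

class ToyDecoderPredictor:
    """Stand-in for a trained decoder-only sequence model: builds its
    hidden state from past (u, y) pairs, then rolls forward given only
    the candidate future inputs."""

    def __init__(self, nu=1, ny=1, nh=4, seed=0):
        rng = np.random.default_rng(seed)
        self.A = 0.9 * np.eye(nh) + 0.01 * rng.standard_normal((nh, nh))
        self.B = rng.standard_normal((nh, nu + ny))
        self.C = rng.standard_normal((ny, nh))
        self.h = np.zeros(nh)

    def warm_up(self, past_u, past_y):
        # condition the hidden state on the measured I/O history
        for u, y in zip(past_u, past_y):
            self.h = self.A @ self.h + self.B @ np.concatenate([u, y])

    def rollout(self, future_u):
        # decoder-only multi-step prediction: the model's own output
        # estimate is fed back in place of the unavailable measurement
        preds = []
        for u in future_u:
            y_hat = self.C @ self.h
            self.h = self.A @ self.h + self.B @ np.concatenate([u, y_hat])
            preds.append(y_hat)
        return np.array(preds)

# usage: condition on 20 past samples, then predict 10 steps ahead
model = ToyDecoderPredictor()
model.warm_up(np.zeros((20, 1)), np.zeros((20, 1)))
y_future = model.rollout(np.ones((10, 1)))
```

Whether the real Mamba predictor retains its sub-quadratic scaling under this usage pattern is exactly what the axiom above assumes.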
Reference graph
Works this paper leans on
-
[1]
Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems
T. Chen and H. Chen, “Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems,” IEEE Transactions on Neural Networks, vol. 6, pp. 911–917, 1995
1995
-
[2]
A neural predictive controller for nonlinear systems
M. Lazar and O. Pastravanu, “A neural predictive controller for nonlinear systems,” Mathematics and Computers in Simulation, vol. 60, pp. 315–324, 2002
2002
-
[3]
Model predictive control for nonlinear affine systems based on the simplified dual neural network
Y. Pan and J. Wang, “Model predictive control for nonlinear affine systems based on the simplified dual neural network,” in IEEE Conference on Control Applications (CCA) & Intelligent Control (ISIC), 2009, pp. 683–688
2009
-
[4]
Computationally efficient predictive control based on ANN state-space models
J. H. Hoekstra, B. Cseppentő, G. I. Beintema, M. Schoukens, Z. Kollár, and R. Tóth, “Computationally efficient predictive control based on ANN state-space models,” in Proceedings of the IEEE Conference on Decision and Control. Institute of Electrical and Electronics Engineers Inc., 2023, pp. 6336–6341
2023
-
[5]
Long short-term memory
S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, pp. 1735–1780, nov 1997
1997
-
[6]
Multistep Prediction of Dynamic Systems With Recurrent Neural Networks
N. Mohajerin and S. L. Waslander, “Multistep Prediction of Dynamic Systems With Recurrent Neural Networks,” IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 11, pp. 3370–3383, 2019. [Online]. Available: https://ieeexplore.ieee.org/document/8630673/
2019
-
[7]
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
K. Cho, B. Van Merrienboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, pp. 1724–1734
2014
-
[8]
Nonlinear MPC for offset-free tracking of systems learned by GRU neural networks
F. Bonassi, C. F. O. da Silva, and R. Scattolini, “Nonlinear MPC for offset-free tracking of systems learned by GRU neural networks,” 2021
2021
-
[9]
Attention is All you Need
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is All you Need,” in Advances in Neural Information Processing Systems, 2017
2017
-
[10]
Simultaneous multistep Transformer architecture for model predictive control
J. Park, M. R. Babaei, S. A. Munoz, A. N. Venkat, and J. D. Hedengren, “Simultaneous multistep Transformer architecture for model predictive control,” Computers and Chemical Engineering, vol. 178, oct 2023
2023
-
[11]
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
A. Gu and T. Dao, “Mamba: Linear-time sequence modeling with selective state spaces,” arXiv preprint arXiv:2312.00752, dec 2023. [Online]. Available: http://arxiv.org/abs/2312.00752
2023
-
[12]
Efficiently Modeling Long Sequences with Structured State Spaces
A. Gu, K. Goel, and C. Ré, “Efficiently modeling long sequences with structured state spaces,” arXiv preprint arXiv:2111.00396, oct 2021. [Online]. Available: http://arxiv.org/abs/2111.00396
2021
-
[13]
Can recurrent neural networks warp time?
C. Tallec and Y. Ollivier, “Can recurrent neural networks warp time?” arXiv preprint arXiv:1804.11188, mar 2018. [Online]. Available: http://arxiv.org/abs/1804.11188
2018
-
[14]
DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators
L. Lu, P. Jin, and G. E. Karniadakis, “DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators,” Nature Machine Intelligence, vol. 3, pp. 218–229, mar 2021
2021
-
[15]
State-Space Models are Accurate and Efficient Neural Operators for Dynamical Systems
Z. Hu, N. A. Daryakenari, Q. Shen, K. Kawaguchi, and G. E. Karniadakis, “State-Space Models are Accurate and Efficient Neural Operators for Dynamical Systems.” [Online]. Available: https://www.ssrn.com/abstract=4990229
-
[16]
MamKO: Mamba-based Koopman operator for modeling and predictive control
Z. Li, M. Han, and X. Yin, “MamKO: Mamba-based Koopman operator for modeling and predictive control,” in The Thirteenth International Conference on Learning Representations, 2025. [Online]. Available: https://openreview.net/forum?id=hNjCVVm0EQ
2025
-
[17]
Deep Operator Neural Network Model Predictive Control
T. O. De Jong, K. Shukla, and M. Lazar, “Deep Operator Neural Network Model Predictive Control,” vol. 4, pp. 501–517. [Online]. Available: https://ieeexplore.ieee.org/document/11181185/
-
[18]
GLU Variants Improve Transformer
N. Shazeer, “GLU variants improve Transformer,” arXiv preprint arXiv:2002.05202, feb 2020. [Online]. Available: http://arxiv.org/abs/2002.05202
2020
-
[20]
[Online]. Available: http://arxiv.org/abs/1901.08428
-
[21]
Beyond BatchNorm: Towards a unified understanding of normalization in deep learning
E. S. Lubana, R. P. Dick, and H. Tanaka, “Beyond BatchNorm: Towards a unified understanding of normalization in deep learning,” arXiv preprint arXiv:2106.05956, oct 2021. [Online]. Available: http://arxiv.org/abs/2106.05956
2021
-
[22]
Root Mean Square Layer Normalization
B. Zhang and R. Sennrich, “Root Mean Square Layer Normalization,” in Advances in Neural Information Processing Systems, 2019. [Online]. Available: https://www.zora.uzh.ch/id/eprint/177483
2019
-
[23]
LSTM and GRU type recurrent neural networks in model predictive control: A review
M. Ławryczuk and K. Zarzycki, “LSTM and GRU type recurrent neural networks in model predictive control: A review,” Neurocomputing, vol. 632, jun 2025
2025
-
[24]
Adam: A Method for Stochastic Optimization
D. P. Kingma and J. Ba, “Adam: A Method for Stochastic Optimization,” arXiv preprint arXiv:1412.6980, dec 2014. [Online]. Available: http://arxiv.org/abs/1412.6980
2014
-
[25]
On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming
A. Wächter and L. T. Biegler, “On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming,” Mathematical Programming, vol. 106, pp. 25–57, may 2006
2006
-
[26]
Four MPC implementations compared on the Quadruple Tank Process benchmark: pros and cons of neural MPC
P. C. Blaud, P. Chevrel, F. Claveau, P. Haurant, and A. Mouraud, “Four MPC implementations compared on the Quadruple Tank Process benchmark: pros and cons of neural MPC,” in IFAC-PapersOnLine, vol. 55. Elsevier B.V., jul 2022, pp. 344–349
2022