Real-World Deployment of Massively Parallel Sampling-Based MPC for Contact-Rich Manipulation

An Thai Le; Georgia Chalvatzaki; Jan Peters; Joao Carvalho; Magnus Dierking

arxiv: 2606.20712 · v1 · pith:3SY7EESInew · submitted 2026-06-16 · 💻 cs.RO

Magnus Dierking, Joao Carvalho, An Thai Le, Georgia Chalvatzaki, Jan Peters

Pith reviewed 2026-06-27 01:02 UTC · model grok-4.3

classification 💻 cs.RO

keywords: sampling-based MPC · contact-rich manipulation · sim-to-real transfer · domain randomization · parallel GPU simulation · model predictive control · robot manipulation · Push-T task

The pith

Sampling-based MPC with structured global sampling outperforms unimodal samplers on a real contact-rich robot manipulation task.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows a full real-to-sim-to-real deployment of sampling-based model predictive control for contact-rich tasks such as object pushing with a Franka arm. It uses JAX to run large numbers of simulations in parallel inside the MuJoCo MJX engine so that many candidate trajectories can be scored at each control step. The main result is that the Model Tensor Planning (MTP) variant, which applies structured global sampling, beats common unimodal samplers such as the Cross-Entropy Method (CEM), Model Predictive Path Integral control (MPPI), and Predictive Sampling (PS) on tasks that require switching between contact modes, and this advantage holds both in simulation and on the physical robot. Tests of online domain randomization inside the MPC loop further show that parameters tied to contact initiation give clearer and faster adaptation signals than global physics parameters at normal replanning rates. These outcomes identify concrete limits that sampling-based controllers face when moved from simulation to hardware in manipulation settings.
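The idea of scoring many candidate trajectories at each control step can be illustrated with a minimal MPPI-style update. This is only a sketch under stated assumptions: it uses plain Python and a hypothetical 1D point-mass model in place of the paper's JAX and MuJoCo MJX rollouts, and the names `rollout_cost` and `mppi_step`, the cost weights, and all hyperparameters are illustrative rather than taken from the paper.

```python
import math
import random

def rollout_cost(x0, v0, controls, target, dt=0.05):
    """Simulate a 1D unit-mass point (toy stand-in for a physics rollout)."""
    x, v, cost = x0, v0, 0.0
    for u in controls:
        v += u * dt                                  # acceleration updates velocity
        x += v * dt                                  # velocity updates position
        cost += (x - target) ** 2 + 1e-3 * u ** 2    # tracking + effort
    return cost

def mppi_step(x0, v0, mean, target, rng, num_samples=256, sigma=2.0, temperature=0.1):
    """One MPPI-style update: sample, score every rollout, softmin-average."""
    horizon = len(mean)
    # Unimodal sampling: Gaussian perturbations around the current mean plan.
    samples = [[m + rng.gauss(0.0, sigma) for m in mean] for _ in range(num_samples)]
    # Each rollout is independent, so this scoring loop is the batched part.
    costs = [rollout_cost(x0, v0, s, target) for s in samples]
    best = min(costs)
    weights = [math.exp(-(c - best) / temperature) for c in costs]
    total = sum(weights)
    return [sum(w * s[t] for w, s in zip(weights, samples)) / total
            for t in range(horizon)]

rng = random.Random(0)
mean = [0.0] * 20                                    # start from a zero-control plan
before = rollout_cost(0.0, 0.0, mean, target=1.0)
for _ in range(10):                                  # a few replanning iterations
    mean = mppi_step(0.0, 0.0, mean, 1.0, rng)
after = rollout_cost(0.0, 0.0, mean, target=1.0)
print(f"cost before: {before:.2f}, after: {after:.2f}")
```

Because every rollout is independent of the others, the scoring loop is the natural place to batch work on a GPU, which is the role the paper assigns to JAX and MJX.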

Core claim

The MTP variant with structured global sampling outperforms unimodal baselines such as CEM, MPPI, and PS across tasks that require mode switching, both in simulation and on hardware. The framework is built with JAX for large-scale parallelization and the high-fidelity MuJoCo MJX simulator, deployed on a Franka Research 3 for the Push-T manipulation task through a real-to-sim-to-real pipeline. Online domain randomization within the MPC sample budget shows that contact-initiation parameters yield interpretable adaptation signals, while global physics parameters provide feedback that is too weak for reliable exploitation at typical replanning frequencies.

What carries the argument

MTP variant with structured global sampling inside a JAX-parallelized MuJoCo MJX sampling-based MPC loop, which evaluates large batches of diverse trajectories to manage multimodal contact behavior.
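A rough sketch of why mixing global and local samples helps with multimodal costs is given here. It assumes a toy double-well cost and a deliberately simplified stand-in for the global sampler: every layer of the graph shares one evenly spaced grid of control nodes, and each path is linearly interpolated over the horizon. The function names, the grid, and the cost are assumptions for illustration and do not reproduce the MTP implementation.

```python
import random

DT = 0.1  # integration step of the toy model (assumed)

def final_state_cost(controls):
    """Double-well cost on the final state: local minimum at -1, better one at 3."""
    s = sum(controls) * DT
    return min((s + 1.0) ** 2 + 0.5, (s - 3.0) ** 2)

def sample_local(rng, mean, num_samples, sigma):
    """Unimodal samples: Gaussian perturbations around the current mean plan."""
    return [[m + rng.gauss(0.0, sigma) for m in mean] for _ in range(num_samples)]

def sample_global(rng, horizon, num_samples, num_layers, nodes):
    """Global samples: random paths through a layered graph of control nodes.

    A path picks one node per layer and is linearly interpolated over the
    horizon. Sharing one fixed grid across layers is a simplifying assumption.
    """
    paths = []
    for _ in range(num_samples):
        knots = [rng.choice(nodes) for _ in range(num_layers)]
        seq = []
        for t in range(horizon):
            pos = t * (num_layers - 1) / (horizon - 1)   # position along the knots
            i = min(int(pos), num_layers - 2)
            frac = pos - i
            seq.append((1.0 - frac) * knots[i] + frac * knots[i + 1])
        paths.append(seq)
    return paths

rng = random.Random(0)
mean = [-1.0] * 10   # current plan sits in the worse local minimum
local = sample_local(rng, mean, num_samples=256, sigma=0.3)
global_paths = sample_global(rng, horizon=10, num_samples=256, num_layers=3,
                             nodes=[-4.0, -2.0, 0.0, 2.0, 4.0])

local_best = min(final_state_cost(u) for u in local)
mixed_best = min(final_state_cost(u) for u in global_paths + local)
print(f"local only: {local_best:.3f}, mixed batch: {mixed_best:.3f}")
```

In this toy setting the local batch cannot leave the well it starts in, while the joint batch contains sequences that land in the better well; the local perturbations remain useful for refining whichever mode is selected.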

If this is right

Structured global sampling enables better handling of contact mode switches than unimodal sampling in both simulation and real hardware.
The JAX-MuJoCo pipeline supports real-time deployment of massively parallel MPC on a standard robot arm for pushing tasks.
Contact-initiation parameters can be randomized online inside the MPC budget to produce usable adaptation signals.
Global physics parameters do not supply strong enough feedback to be exploited reliably at standard MPC replanning rates.
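One way to picture randomizing a contact-initiation parameter inside a fixed sample budget is sketched below. It rests on assumptions that are not from the paper: a quasi-static 1D push in which the block moves only once the pusher is within a collision margin, synthetic observations generated from that same toy model, and hypothetical helpers `split_budget`, `predict_block_shift`, and `score_domains`. It is meant to show why a margin that gates contact onset leaves a visible trace in prediction error, not to reproduce the authors' procedure.

```python
def split_budget(num_samples, num_domains):
    """Divide the rollout budget as evenly as possible across parameter domains."""
    base, extra = divmod(num_samples, num_domains)
    return [base + (1 if i < extra else 0) for i in range(num_domains)]

def predict_block_shift(pusher_start, pusher_end, block_pos, margin):
    """Quasi-static 1D push: the block moves only while the pusher is in contact."""
    contact_point = block_pos - margin   # pusher position where contact begins
    if pusher_end <= contact_point:
        return 0.0                       # contact never initiated in this step
    return pusher_end - max(pusher_start, contact_point)

def score_domains(margins, observations):
    """Mean squared prediction error of each candidate margin."""
    errors = []
    for margin in margins:
        sq = 0.0
        for p0, p1, block, measured in observations:
            sq += (predict_block_shift(p0, p1, block, margin) - measured) ** 2
        errors.append(sq / len(observations))
    return errors

margins = [0.0, 0.01, 0.02, 0.04]   # candidate domains (hypothetical values)
true_margin = 0.02                  # used only to synthesize observations

# (pusher_start, pusher_end, block_pos) for two consecutive replanning steps.
steps = [(0.00, 0.05, 0.10), (0.05, 0.10, 0.10)]
observations = [(p0, p1, b, predict_block_shift(p0, p1, b, true_margin))
                for p0, p1, b in steps]

budget = split_budget(1024, len(margins))
errors = score_domains(margins, observations)
best_margin = margins[errors.index(min(errors))]
print(budget, errors, best_margin)
```

In the first step no candidate predicts contact, so all domains agree and the signal is empty; the domains only separate once contact begins, which is consistent with the observation that such parameters become informative at contact onset.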

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Targeted randomization focused only on contact-related parameters may be necessary to make online adaptation practical within tight time limits.
The same parallel sampling approach could be tested on multi-step assembly or regrasping tasks to check how well it scales when more modes must be considered.
Tighter integration of real-time perception with the sampling process might reduce reliance on simulator fidelity for contact events.
Hybrid methods that use structured global sampling for mode discovery and then refine locally could further improve sample efficiency.

Load-bearing premise

The high-fidelity MuJoCo MJX simulator together with the described online domain randomization captures real-world contact dynamics closely enough to support effective sim-to-real transfer and useful adaptation signals at typical MPC replanning frequencies.

What would settle it

A direct comparison showing that the MTP controller does not produce higher success rates than CEM, MPPI, or PS on the physical robot during mode-switching contact tasks, or that contact-initiation randomization signals do not improve performance over a fixed-parameter baseline, would falsify the central claim.

Figures

Figures reproduced from arXiv: 2606.20712 by An Thai Le, Georgia Chalvatzaki, Jan Peters, Joao Carvalho, Magnus Dierking.

**Figure 2.** Real-world MPC experimental system overview based on ROS2. Blue arrows indicate ROS topic-based communication between processes. Purple arrows indicate Mocap streaming.

Adjacent paper text (truncated): … two environments, framework optimizations, and a complete real-robot pipeline. Bugtrap Escape (fig. 1a) consists of a 2D point mass inside a U-shaped trap built from physical box primitives. The cost penalizes external contact forces and d…

**Figure 1.** Simulation environments and real-world setup.

Adjacent paper text (truncated): B. Model Tensor Planning. Using a Gaussian distribution to model control sequences can lead to local minima. MTP [9] addresses this issue by mixing local Gaussian samples with global tensor-path samples V_G drawn from a layered graph over the control space U. These are combined with local perturbations V_L ∼ N(U, αΣ) into a joint batch of control signals V = [V_G…

**Figure 4.** Sim-to-sim results for Push-T with long and short horizons. Each number on the horizontal axis corresponds to a different initial state of the system. To test the exploration capabilities, we deliberately set the end-effector and T’s initial pose to hard-to-solve configurations. MTP achieves the lowest final pose error across seeds that require mode switching, while unimodal baselines stall in local minima …

**Figure 5.** Real-to-sim-to-real results on the Push-T task. [Plot axes: Domain Index, Planning Step.]

**Figure 7.** Real-world results of the MPC cost over time steps for different domain-randomization strategies. Additionally, global physics parameters (mass, friction) are effectively non-identifiable within a single replanning interval, whereas contact-initiation parameters (collision margins) directly gate contact onset, yielding interpretable adaptation signals and measurable robustness gains. This is a structural …

Abstract

Sampling-based Model Predictive Control (SMPC) is a promising strategy for contact-rich robotic manipulation, combining gradient-free optimization with massively parallel GPU simulation. Yet, most prior work relies on simplified dynamics or remains confined to simulation. We present an MPC framework that leverages JAX for large-scale parallelization and efficient computation, coupled with the high-fidelity MuJoCo MJX simulator, and deploy it on a Franka Research 3 executing the Push-T manipulation task through a complete real-to-sim-to-real pipeline. The MTP variant with structured global sampling outperforms unimodal baselines such as CEM, MPPI, and PS across tasks that require mode switching, both in simulation and on hardware. Furthermore, we evaluate online domain randomization within the MPC sample budget, showing that contact-initiation parameters yield interpretable adaptation signals, whereas global physics parameters provide feedback that is too weak for reliable exploitation at typical replanning frequencies. These findings highlight key challenges for sampling-based MPC in contact-rich manipulation: contact sensitivity, tight compute budgets, and the difficulty of obtaining informative domain-randomization signals in real time.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This paper shows real Franka hardware results for parallel SMPC on Push-T, with MTP structured sampling beating baselines and some honest notes on weak domain-randomization signals.

The main point is that they ran a full real-to-sim-to-real pipeline with JAX and MuJoCo MJX to put massively parallel sampling MPC on a Franka for the Push-T task. The MTP variant with structured global sampling outperforms CEM, MPPI, and PS on mode-switching contact in both simulation and on the actual robot, and the experiments include quantitative comparisons plus analysis of online domain randomization inside the MPC loop.

The work does a solid job moving past simulation-only claims. It supplies direct hardware data on the robot, shows which parameters give usable adaptation signals (contact initiation) versus which do not (global physics at replanning rates), and flags the practical limits around contact sensitivity and compute budgets. The stress-test note confirms the experimental sections back the central claim without internal contradictions.

The soft spots are mostly about scope. All results sit on one task, so it is unclear how far the mode-switching advantage travels to other contact sequences or platforms. The domain-randomization findings are useful but largely negative, and the compute demands remain high even with GPU parallelization. These are real but not load-bearing issues for the reported results.

This is for people working on sampling-based control and sim-to-real in manipulation who want deployment evidence rather than new theory. It deserves peer review because the hardware numbers address a gap that simulation papers leave open.

Referee Report

0 major / 3 minor

Summary. The manuscript presents a JAX-based massively parallel sampling-based MPC framework paired with the high-fidelity MuJoCo MJX simulator. It describes a complete real-to-sim-to-real pipeline and deploys the controller on a Franka Research 3 robot for the Push-T contact-rich manipulation task. The central claim is that the MTP variant employing structured global sampling outperforms unimodal baselines (CEM, MPPI, PS) on mode-switching tasks in both simulation and hardware experiments. The work further evaluates online domain randomization within the MPC sample budget and reports that contact-initiation parameters produce interpretable adaptation signals while global physics parameters yield signals too weak for reliable use at typical replanning frequencies.

Significance. If the reported hardware results hold, the paper supplies concrete evidence that large-scale parallel SMPC can be deployed on real hardware for contact-rich manipulation without simplified dynamics. The quantitative comparisons on the Franka Push-T task together with the adaptation-signal analysis constitute a useful empirical contribution to the robotics community. The explicit discussion of compute-budget and contact-sensitivity limitations provides practical guidance that is often missing from simulation-only studies.

minor comments (3)

[Abstract] The statement that MTP 'outperforms' the baselines would be strengthened by a one-sentence summary of the key quantitative metrics (e.g., success rate or cost reduction) obtained on hardware.
[Method / Experiments] The description of the online domain-randomization procedure would benefit from an explicit statement of the parameter ranges and the exact number of samples allocated to randomization versus task optimization within each replanning cycle.
[Experiments] Figure captions for the hardware results should include the number of independent trials and any statistical test used to support the reported performance differences.
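As an illustration of the kind of statistical test the last comment asks for, the sketch below computes a one-sided Fisher exact test on success counts from two controllers. It is an assumption that success or failure per trial is the relevant metric, the function name `fisher_exact_one_sided` is hypothetical, and the counts are placeholders chosen for the example, not results from the paper.

```python
from math import comb

def fisher_exact_one_sided(success_a, fail_a, success_b, fail_b):
    """P-value for 'A succeeds at least this often' if both rates were equal.

    Sums the hypergeometric probabilities of all tables with the same margins
    in which controller A has at least `success_a` successes.
    """
    n_a = success_a + fail_a
    total = n_a + success_b + fail_b
    total_success = success_a + success_b
    denom = comb(total, n_a)
    p = 0.0
    for k in range(success_a, min(n_a, total_success) + 1):
        p += comb(total_success, k) * comb(total - total_success, n_a - k) / denom
    return p

# Placeholder counts for illustration only; NOT results from the paper.
p_value = fisher_exact_one_sided(success_a=9, fail_a=1, success_b=4, fail_b=6)
p_equal = fisher_exact_one_sided(success_a=5, fail_a=5, success_b=5, fail_b=5)
print(f"one-sided p (9/10 vs 4/10): {p_value:.4f}")
print(f"one-sided p (5/10 vs 5/10): {p_equal:.4f}")
```

An exact test is a reasonable default here because hardware experiments usually involve few trials per condition, where normal approximations are unreliable.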

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of our work, the recognition of its empirical contribution to real-world deployment of massively parallel SMPC, and the recommendation for minor revision. No major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity; purely empirical claims

The paper presents an empirical MPC deployment study with performance comparisons between sampling variants (MTP vs. CEM/MPPI/PS) on simulation and hardware tasks. No derivation chain, equations, fitted parameters, or load-bearing self-citations are described in the abstract or reader summary. Claims rest on direct experimental results rather than any reduction of outputs to inputs by construction. This matches the default expectation for non-circular empirical work.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the unverified fidelity of the MuJoCo MJX simulator for contact dynamics and the assumption that real-time compute budgets allow exploitation of domain-randomization signals.

axioms (1)

Domain assumption: MuJoCo MJX provides high-fidelity simulation of contact dynamics sufficient for sim-to-real transfer in the Push-T task.
Invoked by the real-to-sim-to-real pipeline and the claim that online domain randomization yields interpretable signals.

pith-pipeline@v0.9.1-grok · 5730 in / 1382 out tokens · 49984 ms · 2026-06-27T01:02:12.496424+00:00 · methodology

Reference graph

Works this paper leans on

17 extracted references · 1 canonical work pages

[1] J. Bradbury, R. Frostig, P. Hawkins, M. J. Johnson, C. Leary, D. Maclaurin, G. Necula, A. Paszke, J. VanderPlas, S. Wanderman-Milne, and Q. Zhang. (2018) JAX: Composable transformations of Python+NumPy programs. Software repository. [Online]. Available: https://github.com/jax-ml/jax

[2] DeepMind. (2025) MuJoCo documentation: Computation. [Online]. Available: https://mujoco.readthedocs.io/en/stable/computation/index.html#geintegration

[3] H. Xue, C. Pan, Z. Yi, G. Qu, and G. Shi, “Full-Order Sampling-Based MPC for Torque-Level Locomotion Control via Diffusion-Style Annealing,” Sep. 2024.

[4] J. Alvarez-Padilla, J. Z. Zhang, S. Kwok, J. M. Dolan, and Z. Manchester, “Real-time whole-body control of legged robots with model-predictive path integral control,” in 2025 IEEE International Conference on Robotics and Automation (ICRA), 2025, pp. 14721–14727.

[5] T. Belvedere, M. Ziegltrum, G. Turrisi, and V. Modugno, “Feedback-MPPI: Fast sampling-based MPC via rollout differentiation – adios low-level controllers,” IEEE Robotics and Automation Letters, vol. 11, no. 1, pp. 1–8, 2026.

[6] C. Pezzato, C. Salmi, E. Trevisan, M. Spahn, J. Alonso-Mora, and C. H. Corbato, “Sampling-based Model Predictive Control Leveraging Parallelizable Physics Simulations,” Jan. 2025.

[7] G. Williams, A. Aldrich, and E. A. Theodorou, “Model Predictive Path Integral Control: From Theory to Parallel Computation,” Journal of Guidance, Control, and Dynamics, vol. 40, no. 2, pp. 344–357, Feb. 2017.

[8] C. Pinneri, S. Sawant, S. Blaes, J. Achterhold, J. Stueckler, M. Rolinek, and G. Martius, “Sample-efficient Cross-Entropy Method for Real-time Planning,” Aug. 2020.

[9] A. T. Le, K. Nguyen, M. N. Vu, J. Carvalho, and J. Peters, “Model Tensor Planning,” May 2025.

[10] T. Howell, N. Gileadi, S. Tunyasuvunakool, K. Zakka, T. Erez, and Y. Tassa, “Predictive Sampling: Real-time Behaviour Synthesis with MuJoCo,” Dec. 2022.

[11] G. Williams, N. Wagener, B. Goldfain, P. Drews, J. M. Rehg, B. Boots, and E. A. Theodorou, “Information theoretic MPC for model-based reinforcement learning,” in 2017 IEEE International Conference on Robotics and Automation (ICRA). Singapore: IEEE, May 2017, pp. 1714–1721.

[12] I. Abraham, A. Handa, N. Ratliff, K. Lowrey, T. D. Murphey, and D. Fox, “Model-Based Generalization Under Parameter Uncertainty Using Path Integral Control,” IEEE Robotics and Automation Letters, vol. 5, no. 2, pp. 2864–2871, Apr. 2020.

[13] V. Kurtz. (2024) Hydrax. Software repository. [Online]. Available: https://github.com/vincekurtz/hydrax

[14] S. Macenski, T. Foote, B. Gerkey, C. Lalancette, and W. Woodall, “Robot Operating System 2: Design, architecture, and uses in the wild,” Science Robotics, vol. 7, no. 66, p. eabm6074, 2022. doi:10.1126/scirobotics.abm6074. [Online]. Available: https://www.science.org/doi/abs/10.1126/scirobotics.abm6074

[15] M. Görner, R. Haschke, H. Ritter, and J. Zhang, “MoveIt! Task Constructor for task-level motion planning,” in 2019 International Conference on Robotics and Automation (ICRA), 2019, pp. 190–196.

[16] R. Tsai and R. Lenz, “A new technique for fully autonomous and efficient 3D robotics hand/eye calibration,” IEEE Transactions on Robotics and Automation, vol. 5, no. 3, pp. 345–358, 1989.

[17] J. Tobin, R. Fong, A. Ray, J. Schneider, W. Zaremba, and P. Abbeel, “Domain randomization for transferring deep neural networks from simulation to the real world,” in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2017, pp. 23–30.
