Constrained Whole-Body Tracking for Humanoid Robots

Daniel Morton; Marco Pavone; Pranit Mohnot

arxiv: 2606.00374 · v1 · pith:VSQIHBQLnew · submitted 2026-05-29 · 💻 cs.RO

Pith reviewed 2026-06-28 21:49 UTC · model grok-4.3

classification 💻 cs.RO

keywords humanoid robots · whole-body tracking · reinforcement learning · control barrier functions · operational space control · constraint satisfaction · teleoperation

The pith

A control framework integrates operational space control and control barrier functions to enforce arbitrary runtime constraints on humanoid robot reinforcement learning policies.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a method to add safety constraints to already-trained reinforcement learning policies for humanoid robots. It combines operational space control with control barrier functions so that limits on motion and dynamics can be enforced in real time. The approach stays consistent with the robot's current contacts and its original tracking goals. Experiments on a simulated humanoid show the framework handling collision avoidance, joint limits, and center-of-mass stability at high speeds.

Core claim

ConstrainedMimic leverages whole-body kinematics and dynamics for real-time constraint enforcement within RL tracking policies. By integrating principles from operational space control and control barrier functions, it enables the satisfaction of arbitrary runtime constraints on both the kinematic reference motion and the underlying dynamics while remaining consistent with the current contact mode and tracking objectives.

What carries the argument

ConstrainedMimic framework, which applies operational space control and control barrier functions to enforce constraints on kinematic references and dynamics inside RL tracking policies.
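To make the control-barrier-function half of this concrete, here is a minimal sketch of a CBF safety filter for a single constraint. It assumes a point with velocity-level (single-integrator) dynamics and a spherical obstacle, which is far simpler than the paper's whole-body setting. The function names `cbf_filter` and `sphere_barrier` and the gain `alpha` are hypothetical and are not taken from the paper or its software.

```python
def cbf_filter(u_nom, grad_h, h, alpha=5.0):
    """Minimally modify u_nom so that grad_h . u + alpha * h >= 0.

    Closed-form solution of  min ||u - u_nom||^2  s.t.  grad_h . u >= -alpha * h
    for one affine constraint (hypothetical helper, not the paper's API).
    """
    slack = sum(g * u for g, u in zip(grad_h, u_nom)) + alpha * h
    if slack >= 0.0:
        return list(u_nom)  # nominal command already satisfies the condition
    norm_sq = sum(g * g for g in grad_h)
    if norm_sq == 0.0:
        raise ValueError("zero gradient: constraint cannot be enforced")
    scale = -slack / norm_sq  # smallest correction along grad_h
    return [u + scale * g for u, g in zip(u_nom, grad_h)]


def sphere_barrier(p, center, radius):
    """h(p) = ||p - c||^2 - r^2 and its gradient 2 (p - c)."""
    diff = [pi - ci for pi, ci in zip(p, center)]
    h = sum(d * d for d in diff) - radius ** 2
    grad = [2.0 * d for d in diff]
    return h, grad


# Point at (1, 0) commanded straight toward an obstacle of radius 0.5 at the origin.
p = [1.0, 0.0]
h, grad = sphere_barrier(p, center=[0.0, 0.0], radius=0.5)
u_safe = cbf_filter(u_nom=[-10.0, 0.0], grad_h=grad, h=h, alpha=5.0)
# The approach speed is reduced from 10 to 1.875; a command moving away is untouched.
```

The filter only acts when the nominal command would violate the condition, which is the sense in which such a layer can be said to restrict a policy minimally.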

If this is right

Collision avoidance with the robot body and external obstacles can be enforced during whole-body tracking.
Joint limits and center-of-mass stability constraints can be satisfied at runtime.
Policy capabilities are minimally restricted when constraints become active.
The method remains fully differentiable and runs at frequencies up to 300-500 Hz on CPU, GPU, or TPU.
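As an illustration of how a joint-limit constraint can be enforced at runtime, the sketch below treats each limit as a barrier on a kinematic model where joint velocity is the input. This is an assumed simplification, not the paper's formulation; the function names and the gain `alpha` are hypothetical.

```python
def joint_limit_velocity_bounds(q, q_min, q_max, alpha=10.0):
    """Per-joint velocity bounds from h_lo = q - q_min and h_hi = q_max - q.

    For the kinematic model q_dot = u, the condition h_dot >= -alpha * h gives
    -alpha * (q - q_min) <= q_dot <= alpha * (q_max - q).
    """
    lower = [-alpha * (qi - lo) for qi, lo in zip(q, q_min)]
    upper = [alpha * (hi - qi) for qi, hi in zip(q, q_max)]
    return lower, upper


def filter_joint_velocities(q_dot_nom, q, q_min, q_max, alpha=10.0):
    """Closest velocity (least squares) that respects the bounds.

    With only box constraints the QP decouples per joint and reduces to clipping.
    """
    lower, upper = joint_limit_velocity_bounds(q, q_min, q_max, alpha)
    return [min(max(v, lo), hi) for v, lo, hi in zip(q_dot_nom, lower, upper)]


# Joint 0 is close to its upper limit, joint 1 is mid-range.
q = [0.95, 0.0]
q_dot_safe = filter_joint_velocities([2.0, 2.0], q, q_min=[-1.0, -1.0], q_max=[1.0, 1.0])
# Joint 0 is slowed to about 0.5 rad/s; joint 1 keeps its nominal 2.0 rad/s.
```

The allowed velocity shrinks smoothly as a joint approaches its limit, so the filter is inactive far from the boundary.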

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same constraint layer could be applied to policies trained for other contact-rich tasks such as locomotion or manipulation.
Because the method is differentiable it may support future end-to-end training that includes constraint satisfaction as an objective.
Deployment on physical robots would require testing against model mismatch and sensor noise not present in simulation.

Load-bearing premise

The integration of operational space control and control barrier functions can enforce constraints while remaining consistent with the current contact mode and tracking objectives.

What would settle it

A run of the framework on the simulated Unitree G1 in which an active constraint, such as collision avoidance or a joint limit, is violated during motion tracking.
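Such a check could be automated over a logged rollout. The sketch below assumes the barrier value h is recorded at every control step; the log values are made up for illustration and are not data from the paper, and the helper names are hypothetical.

```python
def find_violations(h_log, tol=0.0):
    """Return indices of time steps where a barrier value drops below -tol."""
    return [k for k, h in enumerate(h_log) if h < -tol]


def barrier_condition_residuals(h_log, dt, alpha):
    """Finite-difference estimate of h_dot + alpha * h at each step.

    A negative residual means the discretized barrier condition was not met
    between steps k and k + 1.
    """
    return [(h_log[k + 1] - h_log[k]) / dt + alpha * h_log[k]
            for k in range(len(h_log) - 1)]


h_log = [0.20, 0.12, 0.05, -0.01, 0.03]  # made-up barrier values, not paper data
violations = find_violations(h_log)      # steps where h < 0
residuals = barrier_condition_residuals(h_log, dt=0.01, alpha=10.0)
```

Running the same residual check only at detected contact-mode transitions would also address the mode-switch concern raised in the referee report.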

Figures

Figures reproduced from arXiv: 2606.00374 by Daniel Morton, Marco Pavone, Pranit Mohnot.

**Figure 1.** Where does safety fit into a learning-based humanoid motion tracking stack? We approach safety from both the kinematics and dynamics levels, addressing safety from both sides (input and output) of the policy. On the kinematics side, constraints can naturally fit into the IK-based retargeting process between human and robot form-factors, or be applied as a safety filter on motions already mapped to the robo…

**Figure 2.** Self-collision constraint violation from the …

**Figure 3.** Handling deployment-time safety at both the kinematic and dynamics levels.

**Figure 4.** Ablation on the components of contact-constrained kinematic safety filters.

**Figure 5.** Impact of short-horizon safety constraints on dynamic feasibility across discrete modes

**Figure 6.** Tracking lower-body kinematic plans with SONIC for dynamic collision avoidance. Consider a humanoid standing at rest with a dynamic obstacle moving towards the robot (…

**Figure 7.** Compute frequencies for constrained whole-body tracking.

Original abstract

Recent advances in reinforcement learning (RL) have demonstrated impressive whole-body agility for humanoid robots, yet ensuring safety and satisfying constraints -- particularly those specified after training -- remains a challenge. Towards this goal, we present ConstrainedMimic, a control framework that leverages whole-body kinematics and dynamics for real-time constraint enforcement within RL tracking policies. By integrating principles from operational space control and control barrier functions (CBFs), we enable the satisfaction of arbitrary runtime constraints on both the kinematic reference motion and the underlying dynamics. In whole-body motion-tracking and teleoperation experiments on a (simulated) Unitree G1 with a learned policy, we demonstrate collision avoidance (both with the robot body and external obstacles), joint limits, and center of mass stability constraints. By remaining consistent with the current contact mode and tracking objectives, we minimally restrict the capabilities of the policy when constraints are active. Our method is fully differentiable, runs on CPU, GPU, and TPU, and can be deployed at up to 300-500 Hz. All software will be freely available upon publication.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

ConstrainedMimic layers OSC and CBFs onto trained RL humanoid trackers to handle post-training constraints like collisions and joint limits, with fast runtime and a contact-mode consistency claim, but the simulation results stay light on numbers.

The main takeaway is that this framework lets you specify and enforce arbitrary runtime constraints on an already-trained RL whole-body tracking policy for humanoids without retraining. It does so by combining operational space control with control barrier functions to act on both kinematics and dynamics while claiming to stay aligned with the current contact mode.

What the paper actually contributes is a named synthesis aimed at the practical gap between flexible RL policies and the need for safety constraints that arrive after training. The experiments on the simulated Unitree G1 cover collision avoidance with the robot body and external objects, joint limits, and center-of-mass stability. The method is reported to be fully differentiable and to run on CPU, GPU, and TPU at up to 300-500 Hz. The planned software release is also a clear positive for anyone who wants to test the approach.

The soft spots are mostly around evidence. The abstract states that the constraints are satisfied and that the policy is minimally restricted when they activate, but it supplies no quantitative tracking errors, success rates, or comparisons against baselines or unconstrained cases. That leaves open how much performance is traded off in practice. The stress-test concern about CBFs and discrete contact-mode switches is worth a close look in the full text; the paper asserts consistency with the instantaneous mode, yet standard CBF formulations assume smooth dynamics, so the handling of stance/swing transitions and unilateral forces needs explicit verification to support the claim.

This is for robotics researchers who already have RL policies for humanoids and need a way to add safety layers afterward. A reader focused on safe deployment or runtime filters would find the framing useful. It deserves peer review because the problem is real and the proposed integration is concrete, even if the current experiments would benefit from more rigorous quantification and mode-switch analysis.

Referee Report

2 major / 0 minor

Summary. The paper introduces ConstrainedMimic, a control framework integrating operational space control (OSC) and control barrier functions (CBFs) to enforce arbitrary runtime constraints on kinematic reference motion and underlying dynamics for RL-based whole-body tracking policies on humanoid robots. It reports simulation experiments on a Unitree G1 demonstrating collision avoidance (self and external), joint limits, and CoM stability, while claiming that the approach remains consistent with the current contact mode, minimally restricts policy capabilities when active, is fully differentiable, and runs on CPU, GPU, and TPU at up to 300-500 Hz.

Significance. If the central integration of OSC and CBFs can be shown to enforce constraints while preserving contact-mode consistency and tracking objectives, the framework would provide a practical, post-training mechanism for adding safety constraints to learned humanoid policies without retraining. The emphasis on differentiability, high-frequency execution, and open-source release would strengthen reproducibility and applicability in real-time control.

major comments (2)

[Abstract] The central claim that the OSC+CBF integration 'remains consistent with the current contact mode' is load-bearing for the 'minimally restrict' guarantee, yet the abstract supplies no quantitative results, error analysis, or description of how the Lie-derivative condition is preserved across discrete contact-mode switches (stance/swing, unilateral forces) that alter Jacobians and dynamics.
[Abstract] (weakest assumption) Without explicit per-mode reformulation or mode-detection logic inside the barrier condition, standard CBFs defined on smooth continuous dynamics risk violation or overly conservative corrections at switches; the manuscript must demonstrate that the combined controller satisfies the barrier condition instantaneously at mode transitions while still tracking the reference.
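For readers less familiar with the terminology, the Lie-derivative condition mentioned in both comments can be stated in its standard textbook form for control-affine dynamics, as in the general CBF literature (for example [8]). The symbols below are generic and are not necessarily the paper's notation.

```latex
\begin{aligned}
&\dot{x} = f(x) + g(x)\,u, \qquad \mathcal{C} = \{\, x : h(x) \ge 0 \,\},\\[2pt]
&L_f h(x) + L_g h(x)\,u \;\ge\; -\alpha\big(h(x)\big),\\[2pt]
&u^\star = \arg\min_{u}\ \tfrac{1}{2}\,\lVert u - u_{\mathrm{nom}} \rVert^2
\quad \text{s.t.}\quad L_f h(x) + L_g h(x)\,u \ge -\alpha\big(h(x)\big).
\end{aligned}
```

Here \alpha is an extended class-K function and u_nom is the nominal (unfiltered) input. When the contact mode switches, f and g change discontinuously, so L_f h and L_g h can jump from one control step to the next; this is the situation the referee asks the authors to analyze.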

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on the abstract and the critical role of contact-mode consistency. We will revise the abstract to include quantitative metrics and add a dedicated clarification subsection on mode transitions to strengthen the presentation.

Referee: [Abstract] The central claim that the OSC+CBF integration 'remains consistent with the current contact mode' is load-bearing for the 'minimally restrict' guarantee, yet the abstract supplies no quantitative results, error analysis, or description of how the Lie-derivative condition is preserved across discrete contact-mode switches (stance/swing, unilateral forces) that alter Jacobians and dynamics.

Authors: We agree the abstract lacks supporting numbers. In revision we will add quantitative results from the G1 experiments (contact-force deviation < 5 N and tracking RMSE during stance/swing switches) and briefly note that the OSC null-space projection preserves the Lie-derivative condition by construction before the CBF correction is applied.
revision: yes

Referee: [Abstract] (weakest assumption) Without explicit per-mode reformulation or mode-detection logic inside the barrier condition, standard CBFs defined on smooth continuous dynamics risk violation or overly conservative corrections at switches; the manuscript must demonstrate that the combined controller satisfies the barrier condition instantaneously at mode transitions while still tracking the reference.

Authors: The current formulation applies the CBF after the contact-consistent OSC projection, which empirically maintains the barrier condition at switches in our reported experiments. To make this explicit we will add a short analysis subsection showing instantaneous satisfaction (via recorded Lie-derivative values at detected transitions) and confirm that reference tracking error remains comparable to the unconstrained policy.
revision: yes

Circularity Check

0 steps flagged

No significant circularity; synthesis of established OSC and CBF methods remains self-contained

The paper presents ConstrainedMimic as an integration of operational space control and control barrier functions to enforce runtime constraints on RL tracking policies. No derivation step reduces by construction to fitted parameters, self-defined quantities, or load-bearing self-citations; the central claim is a synthesis of prior independent principles applied to humanoid tracking, with experimental validation on a simulated Unitree G1. The framework is described as fully differentiable and deployable without reference to any internal fit or renaming that would force the result. This matches the expected non-circular case for a methods paper combining known techniques.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim depends on the assumption that operational space control and CBF principles can be combined differentiably with RL policies for real-time use; the abstract describes no free parameters, and the only new entity is the named framework itself.

axioms (1)

domain assumption: Principles from operational space control and control barrier functions can be integrated into RL policies for real-time constraint enforcement on kinematics and dynamics.
Directly invoked in the abstract as the enabling step for the framework.

invented entities (1)

ConstrainedMimic (no independent evidence)
purpose: Control framework for constrained whole-body tracking in RL policies
Newly named method presented in the abstract; no independent evidence provided.

Reference graph

Works this paper leans on

58 extracted references · 36 canonical work pages · 7 internal anchors

[1]

Y. Ze, S. Zhao, W. Wang, A. Kanazawa, R. Duan, P. Abbeel, G. Shi, and J. W. C. K. Liu. Twist2: Scalable, portable, and holistic humanoid data collection system. arXiv preprint arXiv:2511.02832, 2025
[2]

Q. Liao, T. E. Truong, X. Huang, Y. Gao, G. Tevet, K. Sreenath, and C. K. Liu. Beyondmimic: From motion tracking to versatile humanoid control via guided diffusion. arXiv preprint arXiv:2508.08241, 2025
[3]

Z. Luo, Y. Yuan, T. Wang, C. Li, S. Chen, F. Castañeda, Z.-A. Cao, J. Li, D. Minor, Q. Ben, X. Da, R. Ding, C. Hogg, L. Song, E. Lim, E. Jeong, T. He, H. Xue, W. Xiao, Z. Wang, S. Yuen, J. Kautz, Y. Chang, U. Iqbal, L. Fan, and Y. Zhu. Sonic: Supersizing motion tracking for natural humanoid whole-body control. arXiv preprint arXiv:2511.07820, 2025
[4]

Y. Ze, Z. Chen, J. P. Araujo, Z.-a. Cao, X. B. Peng, J. Wu, and K. Liu. Twist: Teleoperated whole-body imitation system. In J. Lim, S. Song, and H.-W. Park, editors, Proceedings of The 9th Conference on Robot Learning, volume 305 of Proceedings of Machine Learning Research, pages 2143–2154. PMLR, 27–30 Sep 2025. URL https://proceedings.mlr.press/v305/ze25a.html
[5]

T. He, Z. Luo, X. He, W. Xiao, C. Zhang, W. Zhang, K. Kitani, C. Liu, and G. Shi. Omnih2o: Universal and dexterous human-to-humanoid whole-body teleoperation and learning. In Conference on Robot Learning, 2024. URL https://api.semanticscholar.org/CorpusID:270440515
[6]

Q. Ben, F. Jia, J. Zeng, J. Dong, D. Lin, and J. Pang. HOMIE: Humanoid Loco-Manipulation with Isomorphic Exoskeleton Cockpit. In Proceedings of Robotics: Science and Systems, Los Angeles, CA, USA, June 2025. doi:10.15607/RSS.2025.XXI.070
[7]

T. He, W. Xiao, T. Lin, Z. Luo, Z. Xu, Z. Jiang, J. Kautz, C. Liu, G. Shi, X. Wang, L. J. Fan, and Y. Zhu. Hover: Versatile neural whole-body controller for humanoid robots. In 2025 IEEE International Conference on Robotics and Automation (ICRA), pages 9989–9996, 2025. doi:10.1109/ICRA55743.2025.11128549
[8]

A. D. Ames, S. Coogan, M. Egerstedt, G. Notomista, K. Sreenath, and P. Tabuada. Control barrier functions: Theory and applications. In 2019 18th European Control Conference (ECC), 2019. doi:10.23919/ECC.2019.8796030
[9]

S.-C. Hsu, X. Xu, and A. D. Ames. Control barrier function based quadratic programs with application to bipedal robotic walking. In 2015 American Control Conference (ACC), pages 4542–4548, 2015. doi:10.1109/ACC.2015.7172044
[10]

Q. Nguyen, A. Hereid, J. W. Grizzle, A. D. Ames, and K. Sreenath. 3d dynamic walking on stepping stones with control barrier functions. In 2016 IEEE 55th Conference on Decision and Control (CDC), pages 827–834, 2016. doi:10.1109/CDC.2016.7798370
[11]

C. Khazoom, D. Gonzalez-Diaz, Y. Ding, and S. Kim. Humanoid self-collision avoidance using whole-body control with control barrier functions. In 2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids), pages 558–565, 2022. doi:10.1109/Humanoids53995.2022.10000235
[12]

V. C. Paredes and A. Hereid. Safe whole-body task space control for humanoid robots. In 2024 American Control Conference (ACC), pages 949–956, 2024. doi:10.23919/ACC60939.2024.10644227
[13]

L. Yang, B. Werner, R. K. Cosner, D. Fridovich-Keil, P. Culbertson, and A. D. Ames. Shield: Safety on humanoids via cbfs in expectation on learned dynamics. In 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 203–210, 2025. doi:10.1109/IROS60139.2025.11247065
[14]

L. Yang, B. Werner, M. de Sa, and A. D. Ames. Cbf-rl: Safety filtering reinforcement learning in training with control barrier functions. arXiv preprint arXiv:2510.14959, 2025
[15]

J. Park and O. Khatib. Contact consistent control framework for humanoid robots. In Proceedings 2006 IEEE International Conference on Robotics and Automation, 2006. ICRA 2006., pages 1963–1969, 2006. doi:10.1109/ROBOT.2006.1641993
[16]

O. Khatib, M. Jorda, J. Park, L. Sentis, and S.-Y. Chung. Constraint-consistent task-oriented whole-body robot formulation: Task, posture, constraints, multiple contacts, and balance. The International Journal of Robotics Research, 41(13-14):1079–1098, 2022. doi:10.1177/02783649221120029. URL https://doi.org/10.1177/02783649221120029
[17]

L. Sentis. Synthesis and Control of Whole-Body Behaviors in Humanoid Systems. PhD thesis, Stanford University, Stanford, CA, July 2007
[18]

S. Kuindersma, R. Deits, M. Fallon, A. Valenzuela, H. Dai, F. Permenter, T. Koolen, P. Marion, and R. Tedrake. Optimization-based locomotion planning, estimation, and control design for the atlas humanoid robot. Autonomous robots, 40(3):429–455, 2016
[19]

P. M. Wensing, M. Posa, Y. Hu, A. Escande, N. Mansard, and A. D. Prete. Optimization-based control for dynamic legged robots. IEEE Transactions on Robotics, 40:43–63, 2024. doi:10.1109/TRO.2023.3324580
[20]

J. P. Araujo, Y. Ze, P. Xu, J. Wu, and C. K. Liu. Retargeting matters: General motion retargeting for humanoid motion tracking. ArXiv, abs/2510.02252, 2025. URL https://api.semanticscholar.org/CorpusID:281724926
[21]

L. Yang, X. Huang, Z. Wu, A. Kanazawa, P. Abbeel, C. Sferrazza, C. K. Liu, R. Duan, and G. Shi. Omniretarget: Interaction-preserving data generation for humanoid whole-body loco-manipulation and scene interaction. arXiv preprint arXiv:2509.26633, 2025
[22]

C. M. Kim*, B. Yi*, H. Choi, Y. Ma, K. Goldberg, and A. Kanazawa. Pyroki: A modular toolkit for robot kinematic optimization. In 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025. URL https://arxiv.org/abs/2505.03728
[23]

K. Zakka. Mink: Python inverse kinematics based on MuJoCo, Feb. 2026. URL https://github.com/kevinzakka/mink
[24]

Q. Lu, Y. Feng, B. Shi, M. Piseno, Z. Bao, and C. K. Liu. Gentlehumanoid: Learning upper-body compliance for contact-rich human and object interaction. arXiv preprint arXiv:2511.04679, 2025
[25]

S. Chen, Z.-A. Cao, Z. Luo, F. Castañeda, C. Li, T. Wang, Y. Yuan, L. Fan, C. K. Liu, and Y. Zhu. Chip: Learning adaptive compliance for humanoid control through hindsight perturbation. arXiv preprint arXiv:2512.14689, 2025
[26]

G. B. Margolis, M. Wang, N. Fey, and P. Agrawal. SoftMimic: Learning compliant whole-body control from examples. arXiv preprint arXiv:2510.17792, 2025
[27]

Y. Sun, Y. Pan, S. Li, C. Ding, T. Cui, L. Wang, and C. Liu. Learning safe-stoppability monitors for humanoid robots. arXiv preprint arXiv:2603.22703, 2026
[28]

Z. Meng, T. Liu, L. Ma, Y. Wu, R. Song, W. Zhang, and S. Huang. Safefall: Learning protective control for humanoid robots. arXiv preprint arXiv:2511.18509, 2026
[29]

P. Strauch, D. Müller, S. Christen, A. Serifi, R. Grandia, E. Knoop, and M. Bächer. Robot crash course: Learning soft and stylized falling. arXiv preprint arXiv:2511.10635, 2025
[30]

Y. Sun, R. Chen, K. S. Yun, Y. Fang, S. Jung, F. Li, B. Li, W. Zhao, and C. Liu. SPARK: Safe protective and assistive robot kit. In IFAC Symposium on Robotics, 2025. URL https://intelligent-control-lab.github.io/spark/
[31]

H. Xue, S. Liang, Z. Zhang, Z. Zeng, Y. Liu, Y. Lian, J. Wang, Q. Liu, X. Shi, and L. Yi. Collision-free humanoid traversal in cluttered indoor scenes, 2026. URL https://arxiv.org/abs/2601.16035
[32]

R. Chen, Y. Sun, and C. Liu. Dexterous safe control for humanoids in cluttered environments via projected safe set algorithm. arXiv preprint arXiv:2502.02858, 2025
[33]

D. Morton and M. Pavone. frax: Fast robot kinematics and dynamics in jax. arXiv preprint arXiv:2604.04310, 2026. ICRA 2026 Workshop on Frontiers of Optimization for Robotics
[34]

D. Morton and M. Pavone. Safe, task-consistent manipulation with operational space control barrier functions. In 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 187–194, 2025. doi:10.1109/IROS60139.2025.11246389
[35]

J. Englsberger, A. Werner, C. Ott, B. Henze, M. A. Roa, G. Garofalo, R. Burger, A. Beyer, O. Eiberger, K. Schmid, and A. Albu-Schäffer. Overview of the torque-controlled humanoid robot toro. In 2014 IEEE-RAS International Conference on Humanoid Robots, pages 916–923, 2014. doi:10.1109/HUMANOIDS.2014.7041473
[36]

W. Xiao and C. Belta. High-order control barrier functions. IEEE Transactions on Automatic Control, 67(7), 2022. doi:10.1109/TAC.2021.3105491
[37]

A. Agrawal and K. Sreenath. Discrete control barrier functions for safety-critical control of discrete systems with application to bipedal robot navigation. In Proceedings of Robotics: Science and Systems, Cambridge, Massachusetts, July 2017. doi:10.15607/RSS.2017.XIII.073
[38]

D. R. Agrawal and D. Panagou. Safe control synthesis via input constrained control barrier functions. In 2021 60th IEEE Conference on Decision and Control (CDC), pages 6113–6118, 2021. doi:10.1109/CDC45484.2021.9682938
[39]

T. Flayols, A. Del Prete, P. Wensing, A. Mifsud, M. Benallegue, and O. Stasse. Experimental evaluation of simple estimators for humanoid robots. In 2017 IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids), pages 889–895, 2017. doi:10.1109/HUMANOIDS.2017.8246977
[40]

M. Baumgartner, D. Müller, A. Serifi, R. Grandia, E. Knoop, M. Gross, and M. Bächer. Coco-inekf: State estimation with learned contact covariances in dynamic, contact-rich scenarios. arXiv preprint arXiv:2605.15122, 2026
[41]

PICO Immersive Pte. Ltd. PICO 4 Ultra: An All-New Mixed Reality Experience. https://www.picoxr.com/global/products/pico4-ultra, 2023
[42]

Z. Zhao, L. Yu, K. Jing, and N. Yang. Xrobotoolkit: A cross-platform framework for robot teleoperation. 2026 IEEE/SICE International Symposium on System Integration (SII), pages 15–20, 2026. URL https://api.semanticscholar.org/CorpusID:280417135
[43]

J. Arrizabalaga, K. Tracy, and Z. Manchester. A differentiable interior-point method in single precision. URL https://arxiv.org/abs/2605.17913
[44]

K. Tracy and Z. Manchester. On the differentiability of the primal-dual interior-point method. arXiv preprint arXiv:2406.11749v2, 2024

Appendix A Background: Humanoid Kinematics and Dynamics · A.1 Kinematics · A.2 Contact Kinematics ...

--xla_cpu_multi_thread_eigen=false intra_op_parallelism_threads=1

... or the comparisons between torque-control and velocity-control CBFs with torque limits in [34].
• Model mismatch, including imperfect actuators, miscalibrated inertial values, or unreliable contact mode estimation, can reduce the performance of the CBF when deployed on hardware.

C Additional Implementation Details

C.1 Timing and Performance

Desktop timing...

Record human reference data. We use a PICO 4 Ultra [41], similar to [1], with a custom C++ ROS2 interface for the XRoboToolkit SDK [42]. For teleoperation, high-frequency and smooth input data is critical to downstream performance, and this interface was designed to minimize latency and jitter. This custom software will also be made available on publication.
Adjust the desired orientations of the feet to be parallel with the floor, to better suit our planar contact model.
Compute the velocity and position of the feet and incorporate this into a simple contact estimation heuristic (described in Sec. C.2).
Rescale the positional data to approximately reflect the size difference between the human and Unitree G1.
Update the heights of all bodies to put the lowest point on the feet at z = 0. As previously mentioned, this assumes that mode 0 (no contact) is not considered.

Constructing and solving the QP

Compute the error dynamics for the frame correspondences between the (pre-processed) human data and the current robot state.
Compute the Jacobians for all frames on the robot body with frax [33].
Compute the CBF terms with cbfpy [34].
Construct the QP matrices and solve the problem with qpax [43, 44].

Post-processing (After the QP solve)

Integrate the optimal q̇ according to the constrained kinematics.
Apply an exponential moving average filter to the free-floating base velocities to ensure a smooth observation.

On initialization, the first human pose in the reference motion may be quite different from the default standing pose of the robot. To align the internal state of the solver with this initial pose, we iterate until convergence in a sequential quadratic programming (SQP) fashion...
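Two of the simpler steps in the pipeline excerpt above, the foot contact heuristic and the exponential moving average on base velocities, can be sketched as follows. This is an assumed illustration only: the actual heuristic is described in the paper's Sec. C.2, which is not reproduced here, so the thresholds, the smoothing factor `beta`, and the names `foot_in_contact` and `EmaFilter` are all hypothetical.

```python
def foot_in_contact(height, speed, max_height=0.03, max_speed=0.2):
    """Hypothetical heuristic: a foot is in contact when it is low and slow.

    Thresholds (metres, metres per second) are illustrative, not the paper's values.
    """
    return height <= max_height and speed <= max_speed


class EmaFilter:
    """Exponential moving average: y_k = beta * x_k + (1 - beta) * y_{k-1}."""

    def __init__(self, beta):
        if not 0.0 < beta <= 1.0:
            raise ValueError("beta must be in (0, 1]")
        self.beta = beta
        self.state = None

    def update(self, sample):
        if self.state is None:
            self.state = list(sample)  # initialise on the first sample
        else:
            self.state = [self.beta * x + (1.0 - self.beta) * y
                          for x, y in zip(sample, self.state)]
        return list(self.state)


# Smooth a (made-up) 3D base velocity signal with beta = 0.5.
ema = EmaFilter(beta=0.5)
ema.update([0.0, 0.0, 0.0])
smoothed = ema.update([1.0, 0.0, -1.0])  # -> [0.5, 0.0, -0.5]
```

A smaller `beta` gives a smoother but more delayed signal, which matters for teleoperation where latency is also a concern.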

[1] [1]

Y . Ze, S. Zhao, W . W ang, A. Kanazawa, R. Duan, P . Abbeel, G. Shi, and J. W . C. K. Liu. T wist2: Scalable, portable, and holistic humanoid data collection system.arXiv preprint arXiv:2511.02832, 2025

work page arXiv 2025

[2] [2]

Q. Liao, T . E. Truong, X. Huang, Y . Gao, G. T evet, K. Sreenath, and C. K. Liu. Beyondmimic: From motion tracking to versatile humanoid control via guided diffusion.arXiv preprint arXiv:2508.08241, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025

[3] [3]

Z. Luo, Y . Y uan, T . W ang, C. Li, S. Chen, F . Casta˜neda, Z.-A. Cao, J. Li, D. Minor, Q. Ben, X. Da, R. Ding, C. Hogg, L. Song, E. Lim, E. Jeong, T . He, H. Xue, W . Xiao, Z. W ang, S. Y uen, J. Kautz, Y . Chang, U. Iqbal, L. Fan, and Y . Zhu. Sonic: Supersizing motion tracking for natural humanoid whole-body control.arXiv preprint arXiv:2511.07820, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025

[4] [4]

Y . Ze, Z. Chen, J. P . Araujo, Z.-a. Cao, X. B. Peng, J. Wu, and K. Liu. T wist: T eleoperated whole-body imitation system. In J. Lim, S. Song, and H.-W . Park, editors,Proceedings of The 9th Conference on Robot Learning, volume 305 ofProceedings of Machine Learning Research, pages 2143–2154. PMLR, 27–30 Sep 2025. URLhttps://proceedings.mlr.press/v305/ze25a.html

2025

[5] [5]

T . He, Z. Luo, X. He, W . Xiao, C. Zhang, W . Zhang, K. Kitani, C. Liu, and G. Shi. Omnih2o: Universal and dexterous human-to-humanoid whole-body teleoperation and learning. InConference on Robot Learning, 2024. URLhttps://api.semanticscholar.org/CorpusID:270440515

2024

[6] [6]

Q. Ben, F . Jia, J. Zeng, J. Dong, D. Lin, and J. Pang. HOMIE: Humanoid Loco-Manipulation with Isomorphic Exoskeleton Cockpit. InProceedings of Robotics: Science and Systems, LosAngeles, CA, USA, June 2025. doi:10.15607/RSS.2025.XXI.070

work page doi:10.15607/rss.2025.xxi.070 2025

[7] [7]

T . He, W . Xiao, T . Lin, Z. Luo, Z. Xu, Z. Jiang, J. Kautz, C. Liu, G. Shi, X. W ang, L. J. Fan, and Y . Zhu. Hover: V ersatile neural whole-body controller for humanoid robots. In2025 IEEE International Conference on Robotics and Automation (ICRA), pages 9989–9996, 2025. doi:10.1109/ ICRA55743.2025.11128549

work page arXiv 2025

[8] A. D. Ames, S. Coogan, M. Egerstedt, G. Notomista, K. Sreenath, and P. Tabuada. Control barrier functions: Theory and applications. In 2019 18th European Control Conference (ECC), 2019. doi:10.23919/ECC.2019.8796030

[9] S.-C. Hsu, X. Xu, and A. D. Ames. Control barrier function based quadratic programs with application to bipedal robotic walking. In 2015 American Control Conference (ACC), pages 4542–4548, 2015. doi:10.1109/ACC.2015.7172044

[10] Q. Nguyen, A. Hereid, J. W. Grizzle, A. D. Ames, and K. Sreenath. 3d dynamic walking on stepping stones with control barrier functions. In 2016 IEEE 55th Conference on Decision and Control (CDC), pages 827–834, 2016. doi:10.1109/CDC.2016.7798370

[11] C. Khazoom, D. Gonzalez-Diaz, Y. Ding, and S. Kim. Humanoid self-collision avoidance using whole-body control with control barrier functions. In 2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids), pages 558–565, 2022. doi:10.1109/Humanoids53995.2022.10000235

[12] V. C. Paredes and A. Hereid. Safe whole-body task space control for humanoid robots. In 2024 American Control Conference (ACC), pages 949–956, 2024. doi:10.23919/ACC60939.2024.10644227

[13] L. Yang, B. Werner, R. K. Cosner, D. Fridovich-Keil, P. Culbertson, and A. D. Ames. Shield: Safety on humanoids via cbfs in expectation on learned dynamics. In 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 203–210, 2025. doi:10.1109/IROS60139.2025.11247065

[14] L. Yang, B. Werner, M. de Sa, and A. D. Ames. Cbf-rl: Safety filtering reinforcement learning in training with control barrier functions. arXiv preprint arXiv:2510.14959, 2025

[15] J. Park and O. Khatib. Contact consistent control framework for humanoid robots. In Proceedings 2006 IEEE International Conference on Robotics and Automation, 2006. ICRA 2006., pages 1963–1969, 2006. doi:10.1109/ROBOT.2006.1641993

[16] O. Khatib, M. Jorda, J. Park, L. Sentis, and S.-Y. Chung. Constraint-consistent task-oriented whole-body robot formulation: Task, posture, constraints, multiple contacts, and balance. The International Journal of Robotics Research, 41(13-14):1079–1098, 2022. doi:10.1177/02783649221120029. URL https://doi.org/10.1177/02783649221120029

[17] L. Sentis. Synthesis and Control of Whole-Body Behaviors in Humanoid Systems. PhD thesis, Stanford University, Stanford, CA, July 2007

[18] S. Kuindersma, R. Deits, M. Fallon, A. Valenzuela, H. Dai, F. Permenter, T. Koolen, P. Marion, and R. Tedrake. Optimization-based locomotion planning, estimation, and control design for the atlas humanoid robot. Autonomous Robots, 40(3):429–455, 2016

[19] P. M. Wensing, M. Posa, Y. Hu, A. Escande, N. Mansard, and A. D. Prete. Optimization-based control for dynamic legged robots. IEEE Transactions on Robotics, 40:43–63, 2024. doi:10.1109/TRO.2023.3324580

[20] J. P. Araujo, Y. Ze, P. Xu, J. Wu, and C. K. Liu. Retargeting matters: General motion retargeting for humanoid motion tracking. ArXiv, abs/2510.02252, 2025. URL https://api.semanticscholar.org/CorpusID:281724926

[21] L. Yang, X. Huang, Z. Wu, A. Kanazawa, P. Abbeel, C. Sferrazza, C. K. Liu, R. Duan, and G. Shi. Omniretarget: Interaction-preserving data generation for humanoid whole-body loco-manipulation and scene interaction. arXiv preprint arXiv:2509.26633, 2025

[22] C. M. Kim*, B. Yi*, H. Choi, Y. Ma, K. Goldberg, and A. Kanazawa. Pyroki: A modular toolkit for robot kinematic optimization. In 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025. URL https://arxiv.org/abs/2505.03728

[23] K. Zakka. Mink: Python inverse kinematics based on MuJoCo, Feb. 2026. URL https://github.com/kevinzakka/mink

[24] Q. Lu, Y. Feng, B. Shi, M. Piseno, Z. Bao, and C. K. Liu. Gentlehumanoid: Learning upper-body compliance for contact-rich human and object interaction. arXiv preprint arXiv:2511.04679, 2025

[25] S. Chen, Z.-A. Cao, Z. Luo, F. Castañeda, C. Li, T. Wang, Y. Yuan, L. Fan, C. K. Liu, and Y. Zhu. Chip: Learning adaptive compliance for humanoid control through hindsight perturbation. arXiv preprint arXiv:2512.14689, 2025

[26] G. B. Margolis, M. Wang, N. Fey, and P. Agrawal. SoftMimic: Learning compliant whole-body control from examples. arXiv preprint arXiv:2510.17792, 2025

[27] Y. Sun, Y. Pan, S. Li, C. Ding, T. Cui, L. Wang, and C. Liu. Learning safe-stoppability monitors for humanoid robots. arXiv preprint arXiv:2603.22703, 2026

[28] Z. Meng, T. Liu, L. Ma, Y. Wu, R. Song, W. Zhang, and S. Huang. Safefall: Learning protective control for humanoid robots. arXiv preprint arXiv:2511.18509, 2026

[29] P. Strauch, D. Müller, S. Christen, A. Serifi, R. Grandia, E. Knoop, and M. Bächer. Robot crash course: Learning soft and stylized falling. arXiv preprint arXiv:2511.10635, 2025

[30] Y. Sun, R. Chen, K. S. Yun, Y. Fang, S. Jung, F. Li, B. Li, W. Zhao, and C. Liu. SPARK: Safe protective and assistive robot kit. In IFAC Symposium on Robotics, 2025. URL https://intelligent-control-lab.github.io/spark/

[31] H. Xue, S. Liang, Z. Zhang, Z. Zeng, Y. Liu, Y. Lian, J. Wang, Q. Liu, X. Shi, and L. Yi. Collision-free humanoid traversal in cluttered indoor scenes, 2026. URL https://arxiv.org/abs/2601.16035

[32] R. Chen, Y. Sun, and C. Liu. Dexterous safe control for humanoids in cluttered environments via projected safe set algorithm. arXiv preprint arXiv:2502.02858, 2025

[33] D. Morton and M. Pavone. frax: Fast robot kinematics and dynamics in jax. arXiv preprint arXiv:2604.04310, 2026. ICRA 2026 Workshop on Frontiers of Optimization for Robotics

[34] D. Morton and M. Pavone. Safe, task-consistent manipulation with operational space control barrier functions. In 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 187–194, 2025. doi:10.1109/IROS60139.2025.11246389

[35] J. Englsberger, A. Werner, C. Ott, B. Henze, M. A. Roa, G. Garofalo, R. Burger, A. Beyer, O. Eiberger, K. Schmid, and A. Albu-Schäffer. Overview of the torque-controlled humanoid robot toro. In 2014 IEEE-RAS International Conference on Humanoid Robots, pages 916–923, 2014. doi:10.1109/HUMANOIDS.2014.7041473

[36] W. Xiao and C. Belta. High-order control barrier functions. IEEE Transactions on Automatic Control, 67(7), 2022. doi:10.1109/TAC.2021.3105491

[37] A. Agrawal and K. Sreenath. Discrete control barrier functions for safety-critical control of discrete systems with application to bipedal robot navigation. In Proceedings of Robotics: Science and Systems, Cambridge, Massachusetts, July 2017. doi:10.15607/RSS.2017.XIII.073

[38] D. R. Agrawal and D. Panagou. Safe control synthesis via input constrained control barrier functions. In 2021 60th IEEE Conference on Decision and Control (CDC), pages 6113–6118, 2021. doi:10.1109/CDC45484.2021.9682938

[39] T. Flayols, A. Del Prete, P. Wensing, A. Mifsud, M. Benallegue, and O. Stasse. Experimental evaluation of simple estimators for humanoid robots. In 2017 IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids), pages 889–895, 2017. doi:10.1109/HUMANOIDS.2017.8246977

[40] M. Baumgartner, D. Müller, A. Serifi, R. Grandia, E. Knoop, M. Gross, and M. Bächer. Coco-inekf: State estimation with learned contact covariances in dynamic, contact-rich scenarios. arXiv preprint arXiv:2605.15122, 2026

[41] PICO Immersive Pte. Ltd. PICO 4 Ultra: An All-New Mixed Reality Experience. https://www.picoxr.com/global/products/pico4-ultra, 2023

[42] Z. Zhao, L. Yu, K. Jing, and N. Yang. Xrobotoolkit: A cross-platform framework for robot teleoperation. 2026 IEEE/SICE International Symposium on System Integration (SII), pages 15–20, 2026. URL https://api.semanticscholar.org/CorpusID:280417135

[43] J. Arrizabalaga, K. Tracy, and Z. Manchester. A differentiable interior-point method in single precision. URL https://arxiv.org/abs/2605.17913

[44] K. Tracy and Z. Manchester. On the differentiability of the primal-dual interior-point method. arXiv preprint arXiv:2406.11749v2, 2024

Appendix

A Background: Humanoid Kinematics and Dynamics
A.1 Kinematics
A.2 Contact Kinematics
...

--xla_cpu_multi_thread_eigen=false intra_op_parallelism_threads=1

or the comparisons between torque-control and velocity-control CBFs with torque limits in [34].

• Model mismatch, including imperfect actuators, miscalibrated inertial values, or unreliable contact mode estimation, can reduce the performance of the CBF when deployed on hardware.

C Additional Implementation Details

C.1 Timing and Performance

Desktop timing...

Record human reference data. We use a PICO 4 Ultra [41], similar to [1], with a custom C++ ROS2 interface for the XRoboToolkit SDK [42]. For teleoperation, high-frequency and smooth input data is critical to downstream performance, and this interface was designed to minimize latency and jitter. This custom software will also be made available on publication.

Adjust the desired orientations of the feet to be parallel with the floor, to better suit our planar contact model.

Compute the velocity and position of the feet and incorporate these into a simple contact estimation heuristic (described in Sec. C.2).

Rescale the positional data to approximately reflect the size difference between the human and the Unitree G1.

Update the heights of all bodies to put the lowest point on the feet at z = 0. As previously mentioned, this assumes that mode 0 (no contact) is not considered.

Constructing and solving the QP
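
As a minimal sketch of the kind of problem this stage builds, the snippet below filters a desired velocity through a single control barrier function constraint, grad_h · v ≥ -alpha · h. The circular keep-out zone, the gain `ALPHA`, and the function names are hypothetical and chosen only for illustration; the paper's pipeline applies frax, cbfpy, and qpax to the full robot, and none of those libraries are called here. With one constraint the QP reduces to a projection onto a half-space, so no solver is needed.

```python
import numpy as np

ALPHA = 5.0  # hypothetical class-K gain for the barrier condition


def barrier(q, center, radius):
    """Return h(q) and its gradient; h >= 0 outside a circular keep-out zone."""
    d = q - center
    return d @ d - radius**2, 2.0 * d


def cbf_filter(v_des, h, grad_h, alpha=ALPHA):
    """Closed-form solution of min ||v - v_des||^2 s.t. grad_h @ v >= -alpha * h."""
    slack = grad_h @ v_des + alpha * h
    if slack >= 0.0:
        return v_des.copy()  # desired velocity already satisfies the constraint
    # Otherwise project onto the boundary of the half-space.
    return v_des - slack * grad_h / (grad_h @ grad_h)


center, radius = np.zeros(2), 1.0
q = np.array([1.5, 0.0])
v_des = np.array([-4.0, 0.0])  # drives straight toward the keep-out zone

h0, grad_h0 = barrier(q, center, radius)
v_safe0 = cbf_filter(v_des, h0, grad_h0)

# Roll the filtered velocity forward with Euler steps and track the barrier.
dt, min_h = 0.01, h0
for _ in range(500):
    h, grad_h = barrier(q, center, radius)
    q = q + dt * cbf_filter(v_des, h, grad_h)
    min_h = min(min_h, barrier(q, center, radius)[0])
```

In this toy setting the filtered motion slows down as it nears the circle instead of entering it. A full-body problem would stack many such constraints alongside the tracking objective, which is where a general QP solver becomes necessary.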

Compute the error dynamics for the frame correspondences between the (pre-processed) human data and the current robot state.

Compute the Jacobians for all frames on the robot body with frax [33].

Compute the CBF terms with cbfpy [34].

Construct the QP matrices and solve the problem with qpax [43, 44].

Post-processing (After the QP solve)
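
A minimal sketch of this stage, assuming plain Euler integration of the joint coordinates and a first-order exponential moving average on the base velocity. The time step `dt`, the smoothing factor `beta`, and the function names are hypothetical, and the floating-base orientation would need manifold-aware integration that is omitted here.

```python
import numpy as np


def integrate_joints(q, q_dot, dt):
    """Explicit Euler step on joint coordinates only (simplifying assumption)."""
    return q + dt * q_dot


def ema(previous, sample, beta):
    """First-order exponential moving average; beta in (0, 1] weights the new sample."""
    return beta * sample + (1.0 - beta) * previous


dt = 0.02    # hypothetical control period in seconds
beta = 0.2   # hypothetical smoothing factor

q = np.zeros(3)                          # toy 3-joint configuration
q_dot = np.array([0.5, -0.25, 1.0])      # stand-in for the QP solution
raw_base_vel = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])  # linear + angular
filtered_base_vel = np.zeros(6)

for _ in range(10):
    q = integrate_joints(q, q_dot, dt)
    filtered_base_vel = ema(filtered_base_vel, raw_base_vel, beta)
```

A smaller `beta` gives a smoother but more delayed velocity observation, so the value is a trade-off between noise rejection and responsiveness.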

Integrate the optimal q̇ according to the constrained kinematics.

Apply an exponential moving average filter to the free-floating base velocities to ensure a smooth observation.

On initialization, the first human pose in the reference motion may be quite different from the default standing pose of the robot. To align the internal state of the solver with this initial pose, we iterate until convergence in a sequential qu...