pith. sign in

arxiv: 2606.10501 · v1 · pith:2KKX7OXFnew · submitted 2026-06-09 · 💻 cs.RO

Uncovering Vulnerability of Vision-Language-Action Models under Joint-Level Physical Faults

Pith reviewed 2026-06-27 12:48 UTC · model grok-4.3

classification 💻 cs.RO
keywords vision-language-action modelsjoint-level physical faultsrobot embodiment mismatchclosed-loop executionresidual calibrationpolicy robustnessactuator degradationfriction faults
0
0 comments X

The pith

Vision-language-action models lose task success under joint-level physical faults even when motions remain feasible.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how physical changes at individual robot joints disrupt vision-language-action policies that map images and language to actions. It establishes that these faults produce joint-dependent drops in success rates and that the degradation stems from closed-loop execution mismatch rather than physical impossibility alone. The authors introduce a lightweight residual calibration method that infers the current fault regime from joint dynamics and applies corrective adjustments while leaving the original policy frozen. This matters for anyone planning to run such models on real hardware where wear, friction, and actuator issues arise over time. The work isolates the embodiment-side vulnerability as a distinct robustness problem separate from perceptual or semantic variations.

Core claim

VLA models are vulnerable when predicted actions are executed through a perturbed robot body. Our analysis reveals joint-dependent effects, with heterogeneous degradation in task success across affected joints. We also show that performance drops cannot be attributed solely to physical infeasibility, since feasible faults such as increased joint friction can still substantially reduce success rates and induce closed-loop execution mismatch. Motivated by these findings, we propose Joint-level Physical-fault Aware Residual Calibrator (J-PARC), a lightweight residual calibration framework built on top of a frozen VLA policy. J-PARC infers a latent joint-fault regime from recent joint dynamics a

What carries the argument

Joint-level Physical-fault Aware Residual Calibrator (J-PARC), which infers a latent fault regime from recent joint dynamics and applies regime-conditioned residual corrections to the actions of a frozen VLA policy.

If this is right

  • Task success degrades heterogeneously depending on which joint experiences the fault.
  • Feasible physical changes such as increased friction still induce substantial closed-loop mismatch and lower success rates.
  • A frozen VLA policy can be augmented with a lightweight, fault-regime-aware residual calibrator to recover robustness.
  • The calibrator leaves performance unchanged in fault-free conditions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Real deployments may require continuous monitoring of joint dynamics to trigger the right calibration regime.
  • The same residual-calibration pattern could be tested on other action-generation methods such as diffusion policies or reinforcement-learning controllers.
  • Combining multiple simultaneous joint faults would test whether the latent-regime inference remains effective under compound degradation.

Load-bearing premise

The listed joint-level faults produce a closed-loop execution mismatch that is representative of real robots and that the experimental setup isolates this mismatch from other factors.

What would settle it

A physical-robot experiment in which the same VLA policy shows no drop in task success under the described joint faults, or in which J-PARC yields no measurable improvement over the base policy under those faults.

Figures

Figures reproduced from arXiv: 2606.10501 by Junha Chun, Minsoo Jo, Taeju Kwon, Taesup Kim, Youngjoon Jeong.

Figure 1
Figure 1. Figure 1: VLA models under joint-level physical faults. A VLA policy successfully completes the LIBERO [18] task in the fault-free setting, but fails when different Franka Panda joints are locked. This illustrates how embodiment-side faults can change the robot’s realized motion without changing the policy output, motivating physical-fault aware action calibration. joint friction [15, 16, 17]. We treat range limits … view at source ↗
Figure 2
Figure 2. Figure 2: Heterogeneous and feasible joint-level fault effects. Joint-level faults cause heterogeneous performance degradation across joints, and increased friction can reduce success rates even when the task remains physically feasible. 2 VLA Models are Vulnerable to Joint-Level Physical faults We first investigate the vulnerability of VLA policies to joint-level physical faults during closed￾loop robot execution. … view at source ↗
Figure 3
Figure 3. Figure 3: Fault-accumulation recovery. Success drops as faults persist before release. To examine whether VLA policies can recover from states induced by persistent joint faults, we evaluate a fault-accumulation setting. The robot first executes under a locked-joint fault for a specified number of steps, allowing the fault-induced deviation to accumulate in the robot state. Then, starting from this accumulated state… view at source ↗
Figure 4
Figure 4. Figure 4: UMAP visualiza￾tion of robot state transi￾tion distributions under dif￾ferent fault conditions. Joint π0.5 CIK ∆ j0 57.4 54.7 -2.6 j1 0.0 0.0 +0.0 j2 37.5 33.3 -4.2 j3 6.3 5.4 -0.9 j4 85.2 83.5 -1.7 j5 86.9 82.2 -4.8 j6 66.7 68.8 +2.1 Mean 48.6 46.8 -1.7 [PITH_FULL_IMAGE:figures/full_fig_p004_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: T-SNE visualization of latent representations learned by the joint-fault regime encoder. We visualize encoder embed￾dings under locked-joint faults and increased-friction faults. The embeddings form joint-dependent clusters under both fault types. To evaluate whether the joint￾fault regime encoder captures meaningful execution contexts, we visualize its latent represen￾tations under different joint-level f… view at source ↗
Figure 6
Figure 6. Figure 6: Real-world evaluation on the Trossen WidowX AI bowl pick-and-place task. Trajectory overlays are drawn on the final observation frame. Under joint-level faults, the base policy without J-PARC often drifts away from the fault-free execution path, while J-PARC redirects the end-effector trajectory toward the successful placement behavior. Action-Space Robustness and Physical Execution Mismatch. Beyond visual… view at source ↗
read the original abstract

Deploying Vision-Language-Action (VLA) models in real robotic systems requires robustness not only to semantic and perceptual variations, but also to embodiment-side faults that change how actions are physically realized. Real robots can experience joint-level changes caused by actuator degradation, hardware faults, safety limits, collision damage, or wear-induced friction. These faults are critical because they alter the action-to-motion interface of a policy, disrupting the learned closed-loop relationship between commanded actions, realized motion, and subsequent observations. In this work, we study realistic joint-level physical faults and show that VLA models are vulnerable when predicted actions are executed through a perturbed robot body. Our analysis reveals joint-dependent effects, with heterogeneous degradation in task success across affected joints. We also show that performance drops cannot be attributed solely to physical infeasibility, since feasible faults such as increased joint friction can still substantially reduce success rates and induce closed-loop execution mismatch. Motivated by these findings, we propose Joint-level Physical-fault Aware Residual Calibrator (J-PARC), a lightweight residual calibration framework built on top of a frozen VLA policy. J-PARC infers a latent joint-fault regime from recent joint dynamics and conditions a shared residual calibrator on this regime, enabling adaptive action correction across faulty joints. Experiments show that J-PARC improves robustness under joint-level faults while preserving fault-free environment performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that Vision-Language-Action (VLA) models are vulnerable to joint-level physical faults (actuator degradation, hardware faults, safety limits, collision damage, wear-induced friction) that alter the action-to-motion interface, causing heterogeneous degradation in task success rates across affected joints. It asserts that these drops cannot be attributed solely to physical infeasibility, as feasible faults like increased friction still induce closed-loop execution mismatch. Motivated by this, the authors propose J-PARC, a lightweight residual calibration framework on a frozen VLA policy that infers a latent joint-fault regime from recent joint dynamics and conditions a shared residual calibrator for adaptive action correction, with experiments showing improved robustness under faults while preserving fault-free performance.

Significance. If the experimental isolation of policy-specific mismatch holds, the work is significant for real-world VLA deployment in robotics, as it shifts focus from perceptual/semantic robustness to embodiment faults that disrupt the learned closed-loop relationship. The joint-dependent heterogeneity analysis and the practical J-PARC mitigation (which avoids retraining the base policy) could guide more reliable robotic systems. The emphasis on feasible faults provides a realistic lens on policy sensitivity beyond obvious kinematic violations.

major comments (2)
  1. [Abstract and Experiments] Abstract and Experiments section: The central claim that 'performance drops cannot be attributed solely to physical infeasibility' and that 'feasible faults such as increased joint friction can still substantially reduce success rates' is load-bearing for distinguishing closed-loop mismatch from generic execution failure. However, no fault-model equations (e.g., definitions of friction coefficients or actuator degradation), kinematic/dynamic feasibility metrics, or baselines (e.g., oracle controller under identical perturbations) are supplied. This prevents verification that the observed heterogeneous success degradation is policy-specific rather than confounded by altered observation distributions or velocity limits.
  2. [Proposed Method (J-PARC)] J-PARC description (proposed method section): The framework 'infers a latent joint-fault regime from recent joint dynamics and conditions a shared residual calibrator on this regime' is presented as the mitigation, but lacks any equations for regime inference, the conditioning mechanism, loss functions, or training details on how the calibrator is learned. Without these, it is impossible to assess whether J-PARC actually addresses the identified mismatch or merely fits to the experimental perturbations.
minor comments (2)
  1. [Abstract] The abstract supplies no quantitative details on datasets, tasks, number of trials, or success-rate tables, which reduces clarity even for a high-level overview.
  2. [Introduction and Method] Notation for 'joint dynamics' and 'latent joint-fault regime' is introduced without prior definition or reference to standard robotics terminology (e.g., joint velocity/position histories), which could be clarified for readers outside the immediate subfield.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback, which highlights important areas for strengthening the presentation of our claims and the J-PARC method. We address each major comment below.

read point-by-point responses
  1. Referee: [Abstract and Experiments] Abstract and Experiments section: The central claim that 'performance drops cannot be attributed solely to physical infeasibility' and that 'feasible faults such as increased joint friction can still substantially reduce success rates' is load-bearing for distinguishing closed-loop mismatch from generic execution failure. However, no fault-model equations (e.g., definitions of friction coefficients or actuator degradation), kinematic/dynamic feasibility metrics, or baselines (e.g., oracle controller under identical perturbations) are supplied. This prevents verification that the observed heterogeneous success degradation is policy-specific rather than confounded by altered observation distributions or velocity limits.

    Authors: We agree that the central claim requires explicit supporting details to isolate policy-specific closed-loop mismatch. In the revised manuscript, we will add the fault-model equations (including definitions of friction coefficients and actuator degradation), kinematic/dynamic feasibility metrics confirming the faults remain feasible, and an oracle controller baseline under identical perturbations. These additions will enable verification that the heterogeneous degradation is attributable to the VLA policy rather than confounders such as altered observations or velocity limits. revision: yes

  2. Referee: [Proposed Method (J-PARC)] J-PARC description (proposed method section): The framework 'infers a latent joint-fault regime from recent joint dynamics and conditions a shared residual calibrator on this regime' is presented as the mitigation, but lacks any equations for regime inference, the conditioning mechanism, loss functions, or training details on how the calibrator is learned. Without these, it is impossible to assess whether J-PARC actually addresses the identified mismatch or merely fits to the experimental perturbations.

    Authors: We acknowledge that the J-PARC description is currently high-level and omits the requested mathematical details. In the revised manuscript, we will provide the equations for latent joint-fault regime inference from recent joint dynamics, the conditioning mechanism for the shared residual calibrator, the loss functions for training the calibrator, and the full training procedure with hyperparameters. These will allow readers to evaluate how J-PARC specifically mitigates the identified mismatch. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical claims rest on experiments, not self-referential derivations

full rationale

The paper is an empirical study of VLA robustness under joint faults, with claims about heterogeneous degradation and non-infeasibility causes supported by experimental results rather than any derivation chain. No equations, fitted parameters renamed as predictions, self-citations used as uniqueness theorems, or ansatzes smuggled via prior work are present in the provided text. The proposed J-PARC framework is described at a high level without reducing to its own inputs by construction. This matches the default expectation for non-circular papers; the reader's assessment of score 1.0 is consistent with absence of any load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

Abstract-only review yields minimal identifiable free parameters or invented entities beyond the proposed framework itself; domain assumptions about fault realism are stated but unverified.

axioms (1)
  • domain assumption Joint-level physical faults alter the action-to-motion interface and disrupt the learned closed-loop relationship between commanded actions, realized motion, and observations.
    Invoked in the abstract as the core reason faults are critical for VLA policies.
invented entities (1)
  • J-PARC (Joint-level Physical-fault Aware Residual Calibrator) no independent evidence
    purpose: Lightweight residual calibration framework that infers latent joint-fault regime and conditions a shared residual calibrator.
    Introduced in the abstract as the proposed solution; no independent evidence of its effectiveness is supplied beyond the abstract claim.

pith-pipeline@v0.9.1-grok · 5784 in / 1313 out tokens · 25994 ms · 2026-06-27T12:48:32.616817+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

45 extracted references · 12 canonical work pages

  1. [1]

    Black, N

    K. Black, N. Brown, J. Darpinian, K. Dhabalia, D. Driess, A. Esmail, M. R. Equi, C. Finn, N. Fusai, M. Y . Galliker, D. Ghosh, L. Groom, K. Hausman, b. ichter, S. Jakubczak, T. Jones, L. Ke, D. LeBlanc, S. Levine, A. Li-Bell, M. Mothukuri, S. Nair, K. Pertsch, A. Z. Ren, L. X. Shi, L. Smith, J. T. Springenberg, K. Stachowicz, J. Tanner, Q. Vuong, H. Walke...

  2. [2]

    Black, N

    K. Black, N. Brown, D. Driess, A. Esmail, M. Equi, C. Finn, N. Fusai, L. Groom, K. Hausman, B. Ichter, S. Jakubczak, T. Jones, L. Ke, S. Levine, A. Li-Bell, M. Mothukuri, S. Nair, K. Pertsch, L. X. Shi, J. Tanner, Q. Vuong, A. Walling, H. Wang, and U. Zhilinsky.π0: A vision-language- action flow model for general robot control, 2026. URL https://arxiv.org...

  3. [3]

    Q. Bu, Y . Yang, J. Cai, S. Gao, G. Ren, M. Yao, P. Luo, and H. Li. Univla: Learning to act anywhere with task-centric latent actions.arXiv preprint arXiv:2505.06111, 2025

  4. [4]

    M. Kim, K. Pertsch, S. Karamcheti, T. Xiao, A. Balakrishna, S. Nair, R. Rafailov, E. Foster, G. Lam, P. Sanketi, Q. Vuong, T. Kollar, B. Burchfiel, R. Tedrake, D. Sadigh, S. Levine, P. Liang, and C. Finn. Openvla: An open-source vision-language-action model.arXiv preprint arXiv:2406.09246, 2024

  5. [5]

    M. J. Kim, C. Finn, and P. Liang. Fine-tuning vision-language-action models: Optimizing speed and success, 2025. URLhttps://arxiv.org/abs/2502.19645

  6. [6]

    J. Guan, T. Ding, L. Cao, L. Pan, C. Wang, and X. Zheng. Probing the robustness of vision- language pretrained models: A multimodal adversarial attack approach, 2024. URL https: //arxiv.org/abs/2408.13461

  7. [7]

    X. Lu, J. Chen, S. Xiao, Z. Jin, R. Zhou, X. Ji, and W. Xu. Exploring the robustness of vision-language-action models against sensor attacks. InProceedings of the 2025 Workshop on Large AI Systems and Models with Privacy and Security Analysis, LAMPS ’25, page 11–18, New York, NY , USA, 2025. Association for Computing Machinery. ISBN 9798400718960. doi:10....

  8. [8]

    S. Fei, S. Wang, J. Shi, Z. Dai, J. Cai, P. Qian, L. Ji, X. He, S. Zhang, Z. Fei, J. Fu, J. Gong, and X. Qiu. Libero-plus: In-depth robustness analysis of vision-language-action models.arXiv preprint arXiv:2510.13626, 2025

  9. [9]

    T.-H. Pham, G. Aikins, T. Truong, and K.-D. Nguyen. Adaptive compensation for robotic joint failures using partially observable reinforcement learning, 2024. URL https://arxiv.org/ abs/2409.14435

  10. [10]

    T. Hou, J. Tu, X. Gao, Z. Dong, P. Zhai, and L. Zhang. Multi-task learning of active fault-tolerant controller for leg failures in quadruped robots, 2024. URL https://arxiv.org/abs/2402. 08996

  11. [11]

    G. G. Briscoe-Martinez, Y . Gautam, R. Shetty, A. Pasricha, M. M. Nicotra, and A. Roncone. Moving On, Even When You’re Broken: Fail-Active Trajectory Generation via Diffusion Policies Conditioned on Embodiment and Task.arXiv e-prints, art. arXiv:2602.02895, Feb

  12. [12]

    doi:10.48550/arXiv.2602.02895

  13. [13]

    I. Eski, S. Erkaya, S. Savas, and S. Yildirim. Fault detection on robot manipulators using artificial neural networks.Robotics and Computer-Integrated Manufacturing, 27(1):115–123,

  14. [14]

    doi:https://doi.org/10.1016/j.rcim.2010.06.017

    ISSN 0736-5845. doi:https://doi.org/10.1016/j.rcim.2010.06.017. URL https://www. sciencedirect.com/science/article/pii/S0736584510000682. 9

  15. [15]

    M. Goel, A. Maciejewski, and V . Balakrishnan. Analyzing unidentified locked-joint failures in kinematically redundant manipulators.J. Field Robotics, 22:15–29, 01 2005. doi:10.1002/rob. 20046

  16. [16]

    Tinós and M

    R. Tinós and M. H. Terra. Free-swinging and locked joint fault detection and isolation in cooperative manipulators. InThe European Symposium on Artificial Neural Networks, 2002. URLhttps://api.semanticscholar.org/CorpusID:13917572

  17. [17]

    X. Liu, H. Li, J. Wang, and G. Cai. Dynamics analysis of flexible space robot with joint friction.Aerospace Science and Technology, 47:164–176, 2015. ISSN 1270-9638. doi:https: //doi.org/10.1016/j.ast.2015.09.030. URL https://www.sciencedirect.com/science/ article/pii/S1270963815002977

  18. [18]

    L. Hao, R. Pagani, M. Beschi, and G. Legnani. Dynamic and friction parameters of an industrial robot: Identification, comparison and repetitiveness analysis.Robotics, 10(1), 2021. ISSN 2218-

  19. [19]

    URL https://www.mdpi.com/2218-6581/10/1/49

    doi:10.3390/robotics10010049. URL https://www.mdpi.com/2218-6581/10/1/49

  20. [20]

    Bittencourt.Modeling and Diagnosis of Friction and Wear in Industrial Robots

    A. Bittencourt.Modeling and Diagnosis of Friction and Wear in Industrial Robots. 09 2014. ISBN 9789175192512. doi:10.3384/diss.diva-109335

  21. [21]

    B. Liu, Y . Zhu, C. Gao, Y . Feng, Q. Liu, Y . Zhu, and P. Stone. Libero: benchmarking knowledge transfer for lifelong robot learning. InProceedings of the 37th International Conference on Neural Information Processing Systems, NIPS ’23, Red Hook, NY , USA, 2023. Curran Associates Inc

  22. [22]

    Levine, A

    S. Levine, A. Kumar, G. Tucker, and J. Fu. Offline reinforcement learning: Tutorial, review, and perspectives on open problems, 2020. URLhttps://arxiv.org/abs/2005.01643

  23. [23]

    J. Guo, Z. Wu, C. Tu, Y . Ma, X. Kong, Z. Liu, J. Ji, S. Zhang, Y . Chen, K. Chen, Q. Dou, Y . Yang, X. Liu, H. Zhao, W. Lv, and S. Li. On robustness of vision-language-action model against multi-modal perturbations. InThe Fourteenth International Conference on Learning Representations, 2026. URLhttps://openreview.net/forum?id=cS6xizdYD5

  24. [24]

    Dariush, Y

    B. Dariush, Y . Zhu, A. Arumbakkam, and K. Fujimura. Constrained closed loop inverse kinematics. In2010 IEEE International Conference on Robotics and Automation, pages 2499–2506, 2010. doi:10.1109/ROBOT.2010.5509456

  25. [25]

    Welman.Inverse Kinematics and Geometric Constraints for Articulated Figure Manipulation [microform]

    C. Welman.Inverse Kinematics and Geometric Constraints for Articulated Figure Manipulation [microform]. Canadian theses on microfiche. Thesis (M.Sc.)–Simon Fraser University, 1993. ISBN 9780315912564. URLhttps://books.google.co.kr/books?id=PqbDAAAACAAJ

  26. [26]

    Brohan, N

    A. Brohan, N. Brown, J. Carbajal, Y . Chebotar, J. Dabis, C. Finn, K. Gopalakrishnan, K. Haus- man, A. Herzog, J. Hsu, J. Ibarz, B. Ichter, A. Irpan, T. Jackson, S. Jesmonth, N. Joshi, R. Julian, D. Kalashnikov, Y . Kuang, I. Leal, K.-H. Lee, S. Levine, Y . Lu, U. Malla, D. Manjunath, I. Mordatch, O. Nachum, C. Parada, J. Peralta, E. Perez, K. Pertsch, J....

  27. [27]

    O. X.-E. Collaboration, A. O’Neill, A. Rehman, A. Gupta, A. Maddukuri, A. Gupta, A. Padalkar, A. Lee, A. Pooley, A. Gupta, A. Mandlekar, A. Jain, A. Tung, A. Bewley, A. Herzog, A. Irpan, A. Khazatsky, A. Rai, A. Gupta, A. Wang, A. Kolobov, A. Singh, A. Garg, A. Kembhavi, A. Xie, A. Brohan, A. Raffin, A. Sharma, A. Yavary, A. Jain, A. Balakrishna, A. Wahid...

  28. [28]

    Ghosh, H

    Octo Model Team, D. Ghosh, H. Walke, K. Pertsch, K. Black, O. Mees, S. Dasari, J. Hejna, C. Xu, J. Luo, T. Kreiman, Y . Tan, L. Y . Chen, P. Sanketi, Q. Vuong, T. Xiao, D. Sadigh, C. Finn, and S. Levine. Octo: An open-source generalist robot policy. InProceedings of Robotics: Science and Systems, Delft, Netherlands, 2024

  29. [29]

    Zitkovich, T

    B. Zitkovich, T. Yu, S. Xu, P. Xu, T. Xiao, F. Xia, J. Wu, P. Wohlhart, S. Welker, A. Wahid, Q. Vuong, V . Vanhoucke, H. Tran, R. Soricut, A. Singh, J. Singh, P. Sermanet, P. R. Sanketi, G. Salazar, M. S. Ryoo, K. Reymann, K. Rao, K. Pertsch, I. Mordatch, H. Michalewski, Y . Lu, S. Levine, L. Lee, T.-W. E. Lee, I. Leal, Y . Kuang, D. Kalashnikov, R. Julia...

  30. [30]

    Y . Ma, Z. Song, Y . Zhuang, J. Hao, and I. King. A survey on vision-language-action models for embodied AI.CoRR, abs/2405.14093, 2024

  31. [31]

    Bjorck, F

    NVIDIA, :, J. Bjorck, F. Castañeda, N. Cherniadev, X. Da, R. Ding, L. J. Fan, Y . Fang, D. Fox, F. Hu, S. Huang, J. Jang, Z. Jiang, J. Kautz, K. Kundalia, L. Lao, Z. Li, Z. Lin, K. Lin, G. Liu, E. Llontop, L. Magne, A. Mandlekar, A. Narayan, S. Nasiriany, S. Reed, Y . L. Tan, G. Wang, Z. Wang, J. Wang, Q. Wang, J. Xiang, Y . Xie, Y . Xu, Z. Xu, S. Ye, Z. ...

  32. [32]

    Pertsch, K

    K. Pertsch, K. Stachowicz, B. Ichter, D. Driess, S. Nair, Q. Vuong, O. Mees, C. Finn, and S. Levine. Fast: Efficient action tokenization for vision-language-action models, 2025. URL https://arxiv.org/abs/2501.09747

  33. [33]

    Shukor, D

    M. Shukor, D. Aubakirova, F. Capuano, P. Kooijmans, S. Palma, A. Zouitine, M. Aractingi, C. Pascal, M. Russi, A. Marafioti, S. Alibert, M. Cord, T. Wolf, and R. Cadene. Smolvla: A vision-language-action model for affordable and efficient robotics, 2025. URL https: //arxiv.org/abs/2506.01844

  34. [34]

    Zhong, F

    Y . Zhong, F. Bai, S. Cai, X. Huang, Z. Chen, X. Zhang, Y . Wang, S. Guo, T. Guan, K. N. Lui, Z. Qi, Y . Liang, Y . Chen, and Y . Yang. A survey on vision-language-action models: An action tokenization perspective, 2025. URLhttps://arxiv.org/abs/2507.01925

  35. [35]

    Z. Wang, Z. Zhou, J. Song, Y . Huang, Z. Shu, and L. Ma. Vlatest: Testing and evaluating vision-language-action models for robotic manipulation.Proc. ACM Softw. Eng., 2(FSE), June

  36. [36]

    URLhttps://doi.org/10.1145/3729343

    doi:10.1145/3729343. URLhttps://doi.org/10.1145/3729343

  37. [37]

    Zhang, S

    H. Zhang, S. Zhang, J. Jin, Q. Zeng, R. Li, and D. Wang. Robustvla: Robustness-aware reinforcement post-training for vision-language-action models, 2025. URL https://arxiv. org/abs/2511.01331

  38. [38]

    T. Wang, C. Han, J. Liang, W. Yang, D. Liu, L. X. Zhang, Q. Wang, J. Luo, and R. Tang. Exploring the adversarial vulnerabilities of vision-language-action models in robotics. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 6948–6958, October 2025

  39. [39]

    H. Liu, S. Ruan, J. Long, J. Wu, J. Hou, H. Tang, T. Jiang, W. Zhou, and W. Yao. Eva-vla: Evaluating vision-language-action models’ robustness under real-world physical variations,

  40. [40]

    URLhttps://arxiv.org/abs/2509.18953

  41. [41]

    Y . Yan, Y . Xie, Y . Zhang, L. Lyu, H. Wang, and Y . Jin. When alignment fails: Multimodal adversarial attacks on vision-language-action models, 2025. URL https://arxiv.org/abs/ 2511.16203

  42. [42]

    X. Zhou, G. Tie, G. Zhang, H. Wang, P. Zhou, and L. Sun. BadVLA: Towards backdoor attacks on vision-language-action models via objective-decoupled optimization. InThe Thirty- ninth Annual Conference on Neural Information Processing Systems, 2025. URL https: //openreview.net/forum?id=rEhVHla9zp

  43. [43]

    Jamisola, A

    R. Jamisola, A. Maciejewski, and R. Roberts. Failure-tolerant path planning for kinematically redundant manipulators anticipating locked-joint failures.IEEE Transactions on Robotics, 22 (4):603–612, 2006. doi:10.1109/TRO.2006.878959

  44. [44]

    Lewis and A

    C. Lewis and A. Maciejewski. Fault tolerant operation of kinematically redundant manipulators for locked joint failures.IEEE Transactions on Robotics and Automation, 13(4):622–629, 1997. doi:10.1109/70.611335

  45. [45]

    Visinsky, J

    M. Visinsky, J. Cavallaro, and I. Walker. Robotic fault detection and fault tolerance: A survey.Reliability Engineering & System Safety, 46(2):139–158, 1994. ISSN 0951-8320. doi: https://doi.org/10.1016/0951-8320(94)90132-5. URL https://www.sciencedirect.com/ science/article/pii/0951832094901325. 12 A Implementation and Evaluation Details A.1 Implementati...