Perturbation-Based Uncertainty for Failure Detection in Vision-Language-Action Models
Pith reviewed 2026-06-26 17:24 UTC · model grok-4.3
The pith
Perturbing hidden activations with Gaussian noise gives VLA models a practical uncertainty estimate for spotting failures without labels or model changes.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that injecting Gaussian perturbations into the hidden activations of transformer-based VLA models, then measuring disagreement across the resulting action predictions, yields an epistemic uncertainty signal. On the LIBERO and LIBERO-PRO benchmarks, that signal improves failure detection under distribution shift relative to sampling-based alternatives.
What carries the argument
Gaussian perturbation of transformer hidden activations, used to generate multiple action predictions whose disagreement serves as the uncertainty measure.
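The mechanism can be illustrated with a minimal sketch. This is not the paper's implementation: the two-layer toy policy, the choice to inject noise before the nonlinearity, and the names `forward` and `perturbation_uncertainty` are all assumptions made for illustration. The defaults (variance 0.1, 5 perturbed passes) mirror the values quoted in the simulated rebuttal further down this page.

```python
import math
import random
from statistics import fmean


def forward(x, w1, w2, noise_var, rng):
    """Toy policy (assumed): action = W2 @ tanh(W1 @ x + eps), eps ~ N(0, noise_var * I)."""
    std = math.sqrt(noise_var)
    # Hidden pre-activations, each perturbed with zero-mean Gaussian noise.
    hidden = [sum(w * xi for w, xi in zip(row, x)) + rng.gauss(0.0, std) for row in w1]
    hidden = [math.tanh(h) for h in hidden]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def perturbation_uncertainty(x, w1, w2, noise_var=0.1, n_samples=5, seed=0):
    """Disagreement score: mean squared distance of perturbed actions from their mean."""
    rng = random.Random(seed)
    actions = [forward(x, w1, w2, noise_var, rng) for _ in range(n_samples)]
    dims = range(len(actions[0]))
    mean_action = [fmean(a[d] for a in actions) for d in dims]
    return fmean(sum((a[d] - mean_action[d]) ** 2 for d in dims) for a in actions)
```

In a real VLA model the same idea would presumably be applied to one or more transformer layers rather than a toy hidden layer; which layers, and how the score is aggregated over an episode, is not specified in the text above.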
If this is right
- Uncertainty can be estimated at inference time for regression or flow-based VLA models lacking explicit probabilities.
- The method requires no supervised failure labels or changes to the model architecture.
- It consistently outperforms sampling-based uncertainty in failure detection tasks under distribution shift.
- It applies to diverse pretrained VLA models in robotic manipulation.
Where Pith is reading between the lines
- The approach could be applied to detect failures in other continuous control tasks beyond manipulation.
- Combining perturbation signals with sampling-based methods might produce stronger combined uncertainty estimates.
- Its effectiveness could be tested on real robot hardware deployments rather than simulation benchmarks.
Load-bearing premise
Disagreement among action predictions from Gaussian-perturbed hidden activations reliably reflects the model's epistemic uncertainty about the correct action.
What would settle it
An experiment on a new distribution-shift dataset with known failures where perturbation disagreement scores do not rank actual failures higher than sampling-based scores or random baselines.
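The ranking test described above can be sketched as an AUROC comparison. This is a hedged illustration, not the paper's evaluation code: the function name `auroc` and the toy scores are invented here, and the numbers are not results from LIBERO or LIBERO-PRO.

```python
def auroc(scores, failed):
    """Probability that a random failed episode scores higher than a random
    successful one, counting ties as 0.5 (the Mann-Whitney form of AUROC)."""
    pos = [s for s, f in zip(scores, failed) if f]
    neg = [s for s, f in zip(scores, failed) if not f]
    if not pos or not neg:
        raise ValueError("need at least one failure and one success")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up episode outcomes and uncertainty scores (illustration only).
failed = [1, 0, 1, 0, 0, 1]
perturbation_scores = [0.9, 0.2, 0.7, 0.4, 0.1, 0.8]
sampling_scores = [0.5, 0.6, 0.7, 0.4, 0.1, 0.3]
gap = auroc(perturbation_scores, failed) - auroc(sampling_scores, failed)
```

Under this reading, the claim would be undermined on a new dataset if `gap` were zero or negative, or if the perturbation AUROC sat near the 0.5 expected from a random baseline.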
Original abstract
Vision-Language-Action (VLA) models have shown strong performance in robotic manipulation, but reliable uncertainty quantification remains challenging, particularly under distribution shift. Unlike autoregressive policies, many modern VLA models generate continuous actions through regression or flow-based generation, where explicit predictive probabilities are unavailable. Moreover, existing approaches often rely on stochastic action sampling or supervised failure labels, limiting their applicability across diverse pretrained VLA models. In this work, we propose a label-free and model-agnostic framework for inference-time uncertainty estimation through hidden activation perturbations, motivated by Bayesian perspectives on local model variations. Specifically, we inject Gaussian perturbations into transformer hidden activations and estimate epistemic signals from disagreement across perturbed action predictions. Experiments on LIBERO and LIBERO-PRO show that perturbation-based uncertainty consistently improves failure detection under distribution shift compared to sampling-based uncertainty, providing a practical uncertainty signal for VLA models.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a label-free, model-agnostic uncertainty estimation method for Vision-Language-Action (VLA) models that generate continuous actions. It injects Gaussian perturbations into transformer hidden activations and derives an epistemic uncertainty signal from disagreement across the resulting action predictions. Motivated by Bayesian views on local model variations, the approach is evaluated on the LIBERO and LIBERO-PRO benchmarks, where it is reported to improve failure detection under distribution shift relative to sampling-based uncertainty baselines.
Significance. If the empirical improvements hold and the perturbation disagreement can be shown to track epistemic uncertainty rather than mere activation sensitivity, the method would offer a practical inference-time tool for safe deployment of pretrained VLA models without requiring ensembles, retraining, or failure labels. The model-agnostic and label-free properties are genuine strengths.
major comments (2)
- [Method] Method section (around the perturbation procedure and Bayesian motivation): the claim that disagreement under Gaussian hidden-state perturbations estimates epistemic uncertainty is load-bearing for the central contribution, yet the manuscript provides no direct validation against established epistemic proxies such as variance across independently trained ensembles or posterior predictive spread. Without this anchor, the observed gains on LIBERO/LIBERO-PRO could arise from differential sensitivity rather than epistemic content.
- [Experiments] Experiments section (LIBERO and LIBERO-PRO results): the abstract asserts 'consistent improvement' in failure detection, but the manuscript must supply the exact quantitative metrics (AUROC, AUPR, or F1 at operating points), the precise perturbation variance schedule, number of perturbations per forward pass, and full ablation tables. These details are required to confirm that the reported gains are not sensitive to post-hoc hyperparameter choices.
minor comments (2)
- [Method] Notation for the perturbation operator and the disagreement metric (e.g., variance or entropy over actions) should be introduced with an equation number for reproducibility.
- [Experiments] The description of the LIBERO-PRO distribution-shift protocol should explicitly state whether the same perturbation hyperparameters were used across both benchmarks or tuned separately.
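One possible form for the notation requested in the first minor comment is sketched below. It is an assumption, not the manuscript's own notation: the symbols $h_\ell$ (hidden activation at layer $\ell$), $f_{>\ell}$ (the remainder of the network after layer $\ell$), $\sigma^2$ (perturbation variance), $K$ (number of perturbed passes), and the choice of a variance-style disagreement metric are all introduced here for illustration.

```latex
\begin{align}
\tilde{h}_\ell^{(k)} &= h_\ell + \epsilon^{(k)}, \qquad \epsilon^{(k)} \sim \mathcal{N}(0, \sigma^2 I), \quad k = 1, \dots, K, \tag{1} \\
a^{(k)} &= f_{>\ell}\big(\tilde{h}_\ell^{(k)}\big), \qquad \bar{a} = \frac{1}{K} \sum_{k=1}^{K} a^{(k)}, \tag{2} \\
u &= \frac{1}{K} \sum_{k=1}^{K} \big\lVert a^{(k)} - \bar{a} \big\rVert_2^2. \tag{3}
\end{align}
```

Equation (1) is the perturbation operator, equation (2) the perturbed action predictions and their mean, and equation (3) the disagreement score; an entropy-based metric, which the referee also mentions, would replace (3).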
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below and outline the revisions we will make.
- Referee: [Method] Method section (around the perturbation procedure and Bayesian motivation): the claim that disagreement under Gaussian hidden-state perturbations estimates epistemic uncertainty is load-bearing for the central contribution, yet the manuscript provides no direct validation against established epistemic proxies such as variance across independently trained ensembles or posterior predictive spread. Without this anchor, the observed gains on LIBERO/LIBERO-PRO could arise from differential sensitivity rather than epistemic content.
  Authors: We acknowledge that the manuscript does not include a direct comparison to ensemble variance or posterior predictive spread, which would provide stronger anchoring for the epistemic claim. The method is explicitly motivated by Bayesian views on local model variations and is intended for pretrained VLA models where ensembles are infeasible due to compute cost. The gains under distribution shift on LIBERO-PRO are consistent with epistemic rather than aleatoric signals, but we agree this is indirect. We will revise the method section to add an explicit limitations paragraph discussing this point and citing related perturbation-based uncertainty literature.
  Revision: partial
- Referee: [Experiments] Experiments section (LIBERO and LIBERO-PRO results): the abstract asserts 'consistent improvement' in failure detection, but the manuscript must supply the exact quantitative metrics (AUROC, AUPR, or F1 at operating points), the precise perturbation variance schedule, number of perturbations per forward pass, and full ablation tables. These details are required to confirm that the reported gains are not sensitive to post-hoc hyperparameter choices.
  Authors: The full manuscript reports AUROC/AUPR values in Tables 1-3 and includes the perturbation variance (0.1) and count (5 per forward pass) in Section 4.2, along with partial ablations. We will revise the abstract to cite the specific metrics, move the variance schedule and perturbation count into the main experiments section, and add complete ablation tables to the appendix in the revised version.
  Revision: yes
Circularity Check
No significant circularity; method is heuristic and empirically validated without self-referential derivations.
The paper presents a label-free perturbation method for uncertainty estimation in VLA models, motivated by general Bayesian ideas on local variations rather than any self-citation chain or fitted parameter. No equations, derivations, or uniqueness theorems appear in the provided text that reduce the disagreement signal to a quantity defined by the same data or prior author work. The central claim rests on empirical comparisons on LIBERO benchmarks, which are external to the method definition itself, rendering the derivation self-contained.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Disagreement across perturbed action predictions estimates epistemic uncertainty
- domain assumption: The framework requires no supervised failure labels