pith. machine review for the scientific record.

arxiv: 2605.07088 · v1 · submitted 2026-05-08 · 💻 cs.CR

Recognition: 2 theorem links · Lean Theorem

Membership Inference Attacks on Vision-Language-Action Models

Amir Houmansadr, Kejing Xia, Mingzhe Li, Renhao Zhang, Yuefeng Peng

Authors on Pith: no claims yet

Pith reviewed 2026-05-11 00:50 UTC · model grok-4.3

classification 💻 cs.CR
keywords membership inference attacks · vision-language-action models · privacy · embodied AI · robotics · black-box attacks · trajectory inference

The pith

VLA models are highly vulnerable to membership inference attacks, including black-box ones based only on generated actions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that vision-language-action models, which control robots by generating actions from visual and language inputs, can be attacked to determine if particular data samples or full trajectories were part of their training set. This matters because these models are trained on small embodied datasets for many epochs and expose observable action behaviors that leak membership information. The authors develop attacks that use both standard signals like token likelihood and VLA-specific ones like action errors and temporal patterns. They show strong performance across benchmarks, highlighting a privacy risk unique to embodied AI systems.

Core claim

VLA models differ from LLMs and VLMs by being fine-tuned on small datasets, operating in constrained action spaces, and exposing executable action outputs that are temporally correlated. This creates a distinct attack surface where membership inference can be performed at sample or trajectory level, even under black-box access using only generated actions.

What carries the argument

The suite of attack methods that exploit observable action errors and temporal motion patterns in addition to classic MIA signals.
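
As an illustration only (not the authors' code), a black-box membership score of this kind can be built from nothing but the actions a VLA produces on a candidate transition. The `policy` wrapper, its `predict` method, and the scoring rules below are hypothetical placeholders; the paper's actual attack features and thresholds are not reproduced here.

```python
# Minimal sketch, assuming strict black-box access to generated actions only.
# `policy.predict` is a hypothetical interface to a VLA model; the candidate
# transition supplies the observation, instruction, and ground-truth action.
import numpy as np

def action_error_score(policy, transition):
    """Negative reproduction error: higher score = more member-like."""
    predicted = np.asarray(policy.predict(transition["observation"],
                                          transition["instruction"]))
    target = np.asarray(transition["action"])
    return -float(np.linalg.norm(predicted - target))

def temporal_smoothness_score(generated_actions):
    """Temporal signal: negative mean step-to-step change of the rollout,
    on the assumption that memorized demonstrations are replayed smoothly."""
    acts = np.asarray(generated_actions)
    return -float(np.mean(np.linalg.norm(np.diff(acts, axis=0), axis=-1)))
```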

If this is right

  • Sample-level inference over individual transitions and trajectory-level inference over complete demonstrations are both feasible (a pooling sketch follows this list).
  • Strict black-box attacks relying solely on generated actions achieve strong performance.
  • These vulnerabilities apply across multiple VLA benchmarks and representative models.
  • Deployed embodied AI systems face practical privacy risks from observable behaviors.
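
To make the sample-level vs. trajectory-level distinction concrete, here is a hedged sketch of how per-transition scores (such as those above) might be pooled into a single trajectory-level decision. The top-k mean aggregation and the calibrated threshold are illustrative assumptions, not the paper's method.

```python
# Illustrative aggregation from per-transition scores to a trajectory-level
# membership call; the pooling rule and threshold are assumptions, not the
# paper's attack.
import numpy as np

def trajectory_score(transition_scores, top_k=10):
    scores = np.sort(np.asarray(transition_scores, dtype=float))[::-1]
    return float(np.mean(scores[:top_k]))

def is_member(transition_scores, threshold, top_k=10):
    # Declare the whole demonstration a training member when the pooled
    # score clears a threshold calibrated on known non-member trajectories.
    return trajectory_score(transition_scores, top_k) > threshold
```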

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If true, robotic systems using VLA models may need to obscure action outputs or use privacy-preserving training to prevent data leakage.
  • Similar vulnerabilities could exist in other action-generating models beyond VLA, such as in autonomous driving.
  • Testing on larger-scale real-world robot deployments would be a natural next step to validate the attack effectiveness outside lab settings.

Load-bearing premise

That the selected VLA models, benchmarks, and access regimes represent real deployed embodied systems where action outputs and temporal patterns stay informative.

What would settle it

Observing that the attacks perform no better than random guessing when applied to VLA models trained on much larger datasets or when action outputs are modified to hide error patterns.
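
That criterion is easy to operationalize: if the attack's AUC over member and non-member scores collapses to the 0.5 chance level, the signal is gone. The sketch below uses synthetic placeholder scores, not the paper's measurements.

```python
# Sketch of the falsification check: estimate attack AUC and compare it to
# the 0.5 chance level. Scores here are synthetic placeholders.
import numpy as np

def attack_auc(member_scores, nonmember_scores):
    """Mann-Whitney estimate of AUC: P(member score > non-member score)."""
    m = np.asarray(member_scores, dtype=float)[:, None]
    n = np.asarray(nonmember_scores, dtype=float)[None, :]
    return float(np.mean((m > n) + 0.5 * (m == n)))

rng = np.random.default_rng(0)
auc = attack_auc(rng.normal(0.2, 1.0, 500), rng.normal(0.0, 1.0, 500))
print(f"AUC = {auc:.3f} (~0.5 would mean no better than random guessing)")
```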

Figures

Figures reproduced from arXiv: 2605.07088 by Amir Houmansadr, Kejing Xia, Mingzhe Li, Renhao Zhang, Yuefeng Peng.

Figure 1. Overview of MIA on VLA models. We study sample-level attacks on individual transition…
Figure 2. Ablation studies on OpenVLA using LIBERO-Spatial. (a) Sample-level MIA AUC under…
Figure 3. Privacy-utility trade-off of mitigations on OpenVLA with LIBERO-Spatial. We evaluate mitigations against action-based MIAs, which remain feasible under black-box access and thus form a practical leakage channel in deployed VLA systems. We consider five defenses: Gaussian action noise, action rounding, stochastic decoding, image jitter, and MC dropout, each with multiple settings; details are in Appendix …
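
For intuition, two of the action-space defenses named in the Figure 3 caption can be sketched in a few lines; the noise scale and rounding precision below are placeholder settings, not the configurations the paper evaluates.

```python
# Hedged sketch of two mitigations from the Figure 3 caption: Gaussian action
# noise and action rounding. Parameter values are illustrative assumptions.
import numpy as np

def add_gaussian_action_noise(action, sigma=0.01, rng=None):
    """Perturb the released action so the attacker's error signal is noisier."""
    rng = rng if rng is not None else np.random.default_rng()
    return np.asarray(action, dtype=float) + rng.normal(0.0, sigma, np.shape(action))

def round_action(action, decimals=2):
    """Quantize the released action, coarsening observable reproduction error."""
    return np.round(np.asarray(action, dtype=float), decimals=decimals)
```

Both sketches trade task precision for privacy, which is exactly the trade-off the figure plots.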
Original abstract

Membership inference attacks (MIAs) have been extensively studied in large language models (LLMs) and vision-language models (VLMs), yet their implications for vision-language-action (VLA) models remain largely unexplored. VLA models differ from standard LLMs and VLMs in several important ways: they are often fine-tuned for many epochs on relatively small embodied datasets, operate over constrained and structured action spaces, and expose action outputs that can be observed as executable behaviors and temporally correlated trajectories. These characteristics suggest a distinct and potentially more informative attack surface for membership inference. In this work, we present the first systematic study of MIAs against VLA systems. We formalize two membership inference settings for VLA models: sample-level inference over individual transition samples and trajectory-level inference over complete embodied demonstrations. We further develop a suite of attack methods under multiple access regimes, including strict black-box access. Our attacks exploit both classic MIA signals, such as token likelihood, and VLA-specific signals, such as observable action errors and temporal motion patterns. Across multiple VLA benchmarks and representative VLA models, these attacks achieve strong inference performance, showing that VLA models are highly vulnerable to membership inference. Notably, black-box attacks based only on generated actions achieve strong performance, highlighting a practical privacy risk for deployed embodied AI systems. Our findings reveal a previously underexplored privacy risk in robotic and embodied AI, and underscore the need for dedicated privacy evaluation and defenses for VLA models.
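
As a point of comparison for the "classic" signal named in the abstract, a token-likelihood membership test can be as simple as thresholding the average log-likelihood the model assigns to the target action tokens, in the spirit of the loss-threshold attack of Yeom et al. [36]. The sketch below assumes gray-box access to per-token log-probabilities and is not the paper's implementation.

```python
# Hedged sketch of a likelihood-threshold membership test over action tokens,
# assuming gray-box access to per-token log-probabilities (an assumption; the
# paper's attacks are not reproduced here).
import numpy as np

def likelihood_score(token_logprobs):
    """Mean log-likelihood of the target action tokens; models fine-tuned for
    many epochs tend to assign higher likelihood to training samples."""
    return float(np.mean(np.asarray(token_logprobs, dtype=float)))

def likelihood_threshold_attack(token_logprobs, threshold):
    return likelihood_score(token_logprobs) > threshold
```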

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents the first systematic study of membership inference attacks (MIAs) on vision-language-action (VLA) models. It formalizes sample-level and trajectory-level inference settings, develops a suite of attacks under multiple access regimes (including strict black-box), and exploits both standard signals (e.g., token likelihood) and VLA-specific signals (observable action errors and temporal motion patterns). Experiments across multiple VLA benchmarks and representative models report strong attack performance, with the notable result that black-box attacks using only generated actions succeed, indicating practical privacy risks for embodied AI systems.

Significance. If the empirical results hold, this work is significant as the first dedicated exploration of MIAs in the VLA domain, extending prior work on LLMs and VLMs by highlighting how fine-tuning on small embodied datasets and observable action outputs create a distinct attack surface. The black-box success from action outputs alone is a concrete strength, as it points to risks in deployed robotic systems where full access is unavailable. The paper earns credit for its systematic multi-regime evaluation and for identifying a previously underexplored privacy issue in embodied AI.

major comments (2)
  1. [Abstract] Abstract: the headline claim that 'black-box attacks based only on generated actions achieve strong performance' and that 'VLA models are highly vulnerable' is asserted without any quantitative metrics, success rates, baselines, or error bars. This makes the central empirical conclusion difficult to evaluate and load-bearing for the practical-risk assertion.
  2. [§4 and §5] §4 (Experiments) and §5 (Discussion): the evaluation is confined to standard academic VLA benchmarks and models, which are typically small, low-noise, curated trajectories. The paper does not test or discuss degradation of the action-error and temporal-pattern signals under realistic conditions (higher action noise, physical feedback, distribution shift). This directly weakens the claim of 'practical privacy risk for deployed embodied AI systems' and requires either additional experiments or a clearer limitations statement.
minor comments (2)
  1. Ensure every VLA model, benchmark, and access regime is explicitly named with citations in the experimental section for reproducibility.
  2. [§3] Clarify the exact definitions of sample-level vs. trajectory-level inference early in §3 to avoid ambiguity when presenting results.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed review. We address each major comment below, indicating the changes we will make to strengthen the manuscript.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the headline claim that 'black-box attacks based only on generated actions achieve strong performance' and that 'VLA models are highly vulnerable' is asserted without any quantitative metrics, success rates, baselines, or error bars. This makes the central empirical conclusion difficult to evaluate and load-bearing for the practical-risk assertion.

    Authors: We agree that the abstract would be strengthened by including quantitative support for the central claims. The experimental results in Section 4 already report specific metrics (including AUC values, accuracy, and baseline comparisons) for the black-box action-based attacks. In the revised manuscript we will update the abstract to incorporate representative quantitative results from these experiments, along with error bars where applicable, to make the empirical conclusions more concrete and directly evaluable. revision: yes

  2. Referee: [§4 and §5] §4 (Experiments) and §5 (Discussion): the evaluation is confined to standard academic VLA benchmarks and models, which are typically small, low-noise, curated trajectories. The paper does not test or discuss degradation of the action-error and temporal-pattern signals under realistic conditions (higher action noise, physical feedback, distribution shift). This directly weakens the claim of 'practical privacy risk for deployed embodied AI systems' and requires either additional experiments or a clearer limitations statement.

    Authors: We acknowledge that our evaluation uses standard academic benchmarks, which limits direct extrapolation to noisy real-world robotic settings. Performing new experiments under physical conditions with higher noise and distribution shift is beyond the scope and resources of the current study. We will therefore revise Section 5 to add a dedicated limitations paragraph that explicitly discusses the potential degradation of action-error and temporal-pattern signals under realistic noise, feedback, and shift conditions, and the resulting implications for claims about deployed embodied systems. revision: partial

Circularity Check

0 steps flagged

No circularity: purely empirical attack evaluation on external benchmarks

full rationale

The paper is an empirical study that formalizes MIA settings for VLA models, constructs attacks using observable signals (token likelihood, action errors, temporal patterns), and reports performance on standard VLA benchmarks and models. No equations, derivations, or fitted parameters are presented as predictions; results are direct measurements against held-out data. No self-citation chains or ansatzes underpin the central claims. The evaluation is self-contained against external signals and datasets.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

No free parameters, invented entities, or ad-hoc axioms are introduced; the work adapts standard MIA techniques to the VLA domain under the assumptions stated in the abstract.

axioms (2)
  • domain assumption: VLA models are fine-tuned for many epochs on relatively small embodied datasets with constrained action spaces
    Stated directly in the abstract as the key difference enabling a distinct attack surface
  • domain assumption: Action outputs are observable as executable behaviors and temporally correlated trajectories
    Used to justify VLA-specific signals like motion patterns

pith-pipeline@v0.9.0 · 5571 in / 1173 out tokens · 54581 ms · 2026-05-11T00:50:15.419584+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Reference graph

Works this paper leans on

38 extracted references · 10 canonical work pages · 5 internal anchors

  1. [1]

    π0.5: A vision-language-action model with open-world generalization

    Kevin Black, Noah Brown, James Darpinian, Karan Dhabalia, Danny Driess, Adnan Esmail, Michael Robert Equi, Chelsea Finn, Niccolo Fusai, Manuel Y Galliker, et al. π0.5: A vision-language-action model with open-world generalization. In 9th Annual Conference on Robot Learning, 2025

  2. [2]

    $\pi_0$: A Vision-Language-Action Flow Model for General Robot Control

    Kevin Black, Noah Brown, Danny Driess, Adnan Esmail, Michael Equi, Chelsea Finn, Niccolo Fusai, Lachy Groom, Karol Hausman, Brian Ichter, et al. π0: A vision-language-action flow model for general robot control. arXiv preprint arXiv:2410.24164, 2024

  3. [3]

    RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

    Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Xi Chen, Krzysztof Choromanski, Tianli Ding, Danny Driess, Avinava Dubey, Chelsea Finn, Pete Florence, Chuyuan Fu, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Kehang Han, Karol Hausman, Alex Herzog, Jasmine Hsu, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, Dmitry Kalashnikov, Y...

  4. [4]

    RT-1: Robotics Transformer for Real-World Control at Scale

    Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Joseph Dabis, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Tomas Jackson, Sally Jesmonth, Nikhil Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, Kuang-Huei Lee, Sergey Levine, Yao Lu, Utsav Malla, De...

  5. [5]

    Membership inference attacks from first principles

    Nicholas Carlini, Steve Chien, Milad Nasr, Shuang Song, Andreas Terzis, and Florian Tramer. Membership inference attacks from first principles. In 2022 IEEE Symposium on Security and Privacy (SP), pages 1897–1914. IEEE, 2022

  6. [6]

    Context-aware membership inference attacks against pre-trained large language models

    Hongyan Chang, Ali Shahin Shamsabadi, Kleomenis Katevas, Hamed Haddadi, and Reza Shokri. Context-aware membership inference attacks against pre-trained large language models. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 7299–7321, 2025

  7. [7]

    Manipulation facing threats: Evaluating physical vulnerabilities in end-to-end vision language action models

    Hao Cheng, Erjia Xiao, Yichi Wang, Chengyuan Yu, Mengshu Sun, Qiang Zhang, Jiahang Cao, Yijie Guo, Ning Liu, Kaidi Xu, et al. Manipulation facing threats: Evaluating physical vulnerabilities in end-to-end vision language action models. arXiv preprint arXiv:2409.13174, 2024

  8. [8]

    Do membership inference attacks work on large language models?

    Michael Duan, Anshuman Suri, Niloofar Mireshghallah, Sewon Min, Weijia Shi, Luke Zettlemoyer, Yulia Tsvetkov, Yejin Choi, David Evans, and Hannaneh Hajishirzi. Do membership inference attacks work on large language models? In First Conference on Language Modeling, 2024

  9. [9]

    Orion: A holistic end-to-end autonomous driving framework by vision-language instructed action generation

    Haoyu Fu, Diankun Zhang, Zongchuang Zhao, Jianfeng Cui, Dingkang Liang, Chong Zhang, Dingyuan Zhang, Hongwei Xie, Bing Wang, and Xiang Bai. Orion: A holistic end-to-end autonomous driving framework by vision-language instructed action generation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 24823–24834, 2025

  10. [10]

    Membership inference attacks against fine-tuned large language models via self-prompt calibration

    Wenjie Fu, Huandong Wang, Chen Gao, Guanghua Liu, Yong Li, and Tao Jiang. Membership inference attacks against fine-tuned large language models via self-prompt calibration. Advances in Neural Information Processing Systems, 37:134981–135010, 2024

  11. [11]

    Jamie Hayes, Ilia Shumailov, Christopher A. Choquette-Choo, Matthew Jagielski, Georgios Kaissis, Milad Nasr, Meenatchi Sundaram Muthu Selva Annamalai, Niloofar Mireshghallah, Igor Shilov, Matthieu Meeus, Yves-Alexandre de Montjoye, Katherine Lee, Franziska Boenisch, Adam Dziedzic, and A. Feder Cooper. Exploring the limits of strong membership inference at...

  12. [12]

    Membership inference attacks against Vision-Language models

    Yuke Hu, Zheng Li, Zhihao Liu, Yang Zhang, Zhan Qin, Kui Ren, and Chun Chen. Membership inference attacks against Vision-Language models. In 34th USENIX Security Symposium (USENIX Security 25), pages 1589–1608, 2025

  13. [13]

    DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

    Alexander Khazatsky, Karl Pertsch, Suraj Nair, Ashwin Balakrishna, Sudeep Dasari, Siddharth Karamcheti, Soroush Nasiriany, Mohan Kumar Srirama, Lawrence Yunliang Chen, Kirsty Ellis, et al. Droid: A large-scale in-the-wild robot manipulation dataset. arXiv preprint arXiv:2403.12945, 2024

  14. [14]

    OpenVLA: An open-source vision-language-action model

    Moo Jin Kim, Karl Pertsch, Siddharth Karamcheti, Ted Xiao, Ashwin Balakrishna, Suraj Nair, Rafael Rafailov, Ethan P Foster, Pannag R Sanketi, Quan Vuong, Thomas Kollar, Benjamin Burchfiel, Russ Tedrake, Dorsa Sadigh, Sergey Levine, Percy Liang, and Chelsea Finn. OpenVLA: An open-source vision-language-action model. In 8th Annual Conference on Robot Learn...

  15. [15]

    Attackvla: Benchmarking adversarial and backdoor attacks on vision-language-action models

    Jiayu Li, Yunhan Zhao, Xiang Zheng, Zonghuan Xu, Yige Li, Xingjun Ma, and Yu-Gang Jiang. Attackvla: Benchmarking adversarial and backdoor attacks on vision-language-action models. arXiv preprint arXiv:2511.12149, 2025

  16. [16]

    Robonurse-vla: Robotic scrub nurse system based on vision-language-action model

    Shunlei Li, Jin Wang, Rui Dai, Wanyu Ma, Wing Yin Ng, Yingbai Hu, and Zheng Li. Robonurse-vla: Robotic scrub nurse system based on vision-language-action model. In 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 3986–3993. IEEE, 2025

  17. [17]

    Membership inference attacks against large vision-language models

    Zhan Li, Yongtao Wu, Yihang Chen, Francesco Tonin, Elias Abad Rocamora, and Volkan Cevher. Membership inference attacks against large vision-language models. Advances in Neural Information Processing Systems, 37:98645–98674, 2024

  18. [18]

    Libero: Benchmarking knowledge transfer for lifelong robot learning

    Bo Liu, Yifeng Zhu, Chongkai Gao, Yihao Feng, Qiang Liu, Yuke Zhu, and Peter Stone. Libero: Benchmarking knowledge transfer for lifelong robot learning. Advances in Neural Information Processing Systems, 36:44776–44791, 2023

  19. [19]

    LOMIA: Label-only membership inference attacks against pre-trained large vision-language models

    Yihao LIU, Xinqi LYU, Dong Wang, Yanjie Li, and Bin Xiao. LOMIA: Label-only membership inference attacks against pre-trained large vision-language models. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025

  20. [20]

    Llm dataset inference: Did you train on my dataset?

    Pratyush Maini, Hengrui Jia, Nicolas Papernot, and Adam Dziedzic. Llm dataset inference: Did you train on my dataset? Advances in Neural Information Processing Systems, 37:124069–124092, 2024

  21. [21]

    Membership inference attacks against language models via neighbourhood comparison

    Justus Mattern, Fatemehsadat Mireshghallah, Zhijing Jin, Bernhard Schölkopf, Mrinmaya Sachan, and Taylor Berg-Kirkpatrick. Membership inference attacks against language models via neighbourhood comparison. In Findings of the Association for Computational Linguistics: ACL 2023, pages 11330–11343, 2023

  22. [22]

    Did the neurons read your book? document-level membership inference for large language models

    Matthieu Meeus, Shubham Jain, Marek Rei, and Yves-Alexandre de Montjoye. Did the neurons read your book? document-level membership inference for large language models. In 33rd USENIX Security Symposium (USENIX Security 24), pages 2369–2385, 2024

  23. [23]

    Comprehensive privacy analysis of deep learning: Passive and active white-box inference attacks against centralized and federated learning

    Milad Nasr, Reza Shokri, and Amir Houmansadr. Comprehensive privacy analysis of deep learning: Passive and active white-box inference attacks against centralized and federated learning. In 2019 IEEE symposium on security and privacy (SP), pages 739–753. IEEE, 2019

  24. [24]

    Open x-embodiment: Robotic learning datasets and rt-x models

    Abby O’Neill, Abdul Rehman, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, et al. Open x-embodiment: Robotic learning datasets and rt-x models. In 2024 IEEE International Conference on Robotics and Automation (ICRA), pages 6892–6903. IEEE, 2024

  25. [25]

    OSLO: One-shot label-only membership inference attacks

    Yuefeng Peng, Jaechul Roh, Subhransu Maji, and Amir Houmansadr. OSLO: One-shot label-only membership inference attacks. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024

  26. [26]

    Scaling up membership inference: When and how attacks succeed on large language models

    Haritz Puerto, Martin Gubri, Sangdoo Yun, and Seong Joon Oh. Scaling up membership inference: When and how attacks succeed on large language models. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 4165–4182, 2025

  27. [27]

    Self-comparison for dataset-level membership inference in large (vision-) language model

    Jie Ren, Kangrui Chen, Chen Chen, Vikash Sehwag, Yue Xing, Jiliang Tang, and Lingjuan Lyu. Self-comparison for dataset-level membership inference in large (vision-) language model. In Proceedings of the ACM on Web Conference 2025, pages 910–920, 2025

  28. [28]

    Detecting pretraining data from large language models

    Weijia Shi, Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, and Luke Zettlemoyer. Detecting pretraining data from large language models. In The Twelfth International Conference on Learning Representations, 2024

  29. [29]

    Membership inference attacks against machine learning models

    Reza Shokri, Marco Stronati, Congzheng Song, and Vitaly Shmatikov. Membership inference attacks against machine learning models. In 2017 IEEE symposium on security and privacy (SP), pages 3–18. IEEE, 2017

  30. [30]

    DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models

    Xiaoyu Tian, Junru Gu, Bailin Li, Yicheng Liu, Yang Wang, Zhiyong Zhao, Kun Zhan, Peng Jia, Xianpeng Lang, and Hang Zhao. Drivevlm: The convergence of autonomous driving and large vision-language models. arXiv preprint arXiv:2402.12289, 2024

  31. [31]

    How much of my dataset did you use? quantitative data usage inference in machine learning

    Yao Tong, Jiayuan Ye, Sajjad Zarifzadeh, and Reza Shokri. How much of my dataset did you use? quantitative data usage inference in machine learning. In The Thirteenth International Conference on Learning Representations, 2025

  32. [32]

    Exploring the adversarial vulnerabilities of vision-language-action models in robotics

    Taowen Wang, Cheng Han, James Liang, Wenhao Yang, Dongfang Liu, Luna Xinyu Zhang, Qifan Wang, Jiebo Luo, and Ruixiang Tang. Exploring the adversarial vulnerabilities of vision-language-action models in robotics. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 6948–6958, 2025

  33. [33]

    Advedm: Fine-grained adversarial attack against vlm-based embodied agents

    Yichen Wang, Hangtao Zhang, Hewen Pan, Ziqi Zhou, Xianlong Wang, Peijin Guo, Lulu Xue, Shengshan Hu, Minghui Li, and Leo Yu Zhang. Advedm: Fine-grained adversarial attack against vlm-based embodied agents. In Advances in Neural Information Processing Systems, 2025

  34. [34]

    Tabvla: Targeted backdoor attacks on vision-language-action models

    Zonghuan Xu, Xiang Zheng, Xingjun Ma, and Yu-Gang Jiang. Tabvla: Targeted backdoor attacks on vision-language-action models. arXiv preprint arXiv:2510.10932, 2025

  35. [35]

    When alignment fails: Multimodal adversarial attacks on vision-language-action models

    Yuping Yan, Yuhan Xie, Yixin Zhang, Lingjuan Lyu, Handing Wang, and Yaochu Jin. When alignment fails: Multimodal adversarial attacks on vision-language-action models. arXiv preprint arXiv:2511.16203, 2025

  36. [36]

    Privacy risk in machine learning: Analyzing the connection to overfitting

    Samuel Yeom, Irene Giacomelli, Matt Fredrikson, and Somesh Jha. Privacy risk in machine learning: Analyzing the connection to overfitting. In 2018 IEEE 31st computer security foundations symposium (CSF), pages 268–282. IEEE, 2018

  37. [37]

    Black-box membership inference attack for LVLMs via prior knowledge-calibrated memory probing

    Jinhua Yin, Peiru Yang, Chen Yang, Huili Wang, Zhiyang Hu, Shangguang Wang, Yongfeng Huang, and Tao Qi. Black-box membership inference attack for LVLMs via prior knowledge-calibrated memory probing. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025

  38. [38]

    Badvla: Towards backdoor attacks on vision-language-action models via objective-decoupled optimization

    Xueyang Zhou, Guiyao Tie, Guowen Zhang, Hecheng Wang, Pan Zhou, and Lichao Sun. Badvla: Towards backdoor attacks on vision-language-action models via objective-decoupled optimization. Advances in Neural Information Processing Systems, 2025