Pith · machine review for the scientific record

arXiv: 2605.14291 · v1 · submitted 2026-05-14 · 💻 cs.CR · cs.AI · cs.CL · cs.CV · cs.LG

Recognition: no theorem link

To See is Not to Learn: Protecting Multimodal Data from Unauthorized Fine-Tuning of Large Vision-Language Model


Pith reviewed 2026-05-15 02:35 UTC · model grok-4.3

classification 💻 cs.CR · cs.AI · cs.CL · cs.CV · cs.LG
keywords unlearnable examples · multimodal data protection · large vision-language models · unauthorized fine-tuning · perturbation injection · cross-modal disruption · proactive defense · LVLM security

The pith

Data owners can add invisible perturbations to images and text to stop large vision-language models from learning real content during unauthorized fine-tuning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces MMGuard as a proactive defense that lets owners protect multimodal data before it is scraped and used for fine-tuning large vision-language models. It injects tiny perturbations that create an optimization shortcut during training, so the model overfits to the noise and loses performance when the noise is absent at test time. A cross-modal binding disruption further shifts attention to tie the noise to the training targets, backed by theoretical guarantees and improved by an ensemble strategy for transfer across models. This is shown to work against nine open-source LVLMs on six datasets in white-box, gray-box, and black-box settings. A sympathetic reader would care because it moves protection upstream, before infringement happens, unlike post-hoc unlearning or watermarking.

Core claim

MMGuard generates unlearnable examples by injecting human-imperceptible perturbations that exploit LVLM learning dynamics to create an optimization shortcut: the model overfits to the noise rather than the content, so downstream performance degrades once the perturbation is absent at inference. A cross-modal binding disruption strategically shifts attention to enforce spurious correlations between the noise and the training targets, with theoretical guarantees, and an ensemble learning strategy boosts cross-model transferability. Evaluations show effective, stealthy, and robust protection under multiple threat models.

What carries the argument

Perturbation injection that minimizes training loss via an optimization shortcut, paired with cross-modal binding disruption to enforce noise-target spurious correlations.
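The shortcut mechanism can be illustrated with a generic error-minimizing ("min-min") perturbation loop. This is a minimal sketch on a linear surrogate, not the paper's MMGuard algorithm; the model, budget, and step sizes are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of error-minimizing ("min-min") noise on a linear surrogate.
# Illustrative only -- not the paper's MMGuard algorithm. The idea: alternate
# between fitting the model to perturbed data and updating a small, bounded
# perturbation so the training loss shrinks further, creating an optimization
# shortcut the model latches onto instead of the real content.

rng = np.random.default_rng(0)
n, d = 64, 8
X = rng.normal(size=(n, d))      # clean features (stand-in for image content)
y = rng.normal(size=n)           # training targets
eps = 0.05                       # imperceptibility budget (L-inf clip)

w = np.zeros(d)                  # surrogate model weights
delta = np.zeros_like(X)         # per-sample perturbation

def mse(weights, feats):
    residual = feats @ weights - y
    return 0.5 * float(np.mean(residual ** 2))

for _ in range(200):
    Xp = X + delta
    # Inner step: fit the surrogate to the *perturbed* data.
    w -= 0.1 * (Xp.T @ (Xp @ w - y)) / n
    # Outer step: move delta to further *minimize* the same training loss,
    # then project back into the imperceptibility budget.
    grad_delta = np.outer(Xp @ w - y, w) / n
    delta = np.clip(delta - grad_delta, -eps, eps)

# The surrogate fits the perturbed data better than the clean data: the
# perturbation, not the content, now carries part of the fit.
print(mse(w, X + delta) < mse(w, X))
```

The same bilevel structure, scaled up to LVLM loss landscapes and paired with the cross-modal binding term, is what the paper's shortcut argument rests on.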

If this is right

  • Fine-tuning on protected data produces models that perform poorly on clean inputs at inference time.
  • The defense holds under white-box, gray-box, and black-box threat models.
  • Ensemble perturbations enable the protection to transfer across different LVLM architectures.
  • Owners gain a tool that acts before data is scraped, reducing reliance on after-the-fact remedies.
  • Protection remains stealthy to humans while disrupting model learning.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If widely used, this could make scraped web data less useful for training, potentially reducing incentives for unauthorized collection.
  • Similar perturbation approaches might extend to protecting data for other model types or single-modality tasks.
  • Training pipelines might need built-in checks to detect or handle such protected data.
  • Over time, routine use could shift norms around public multimodal data availability.

Load-bearing premise

The injected perturbations will reliably force the model to overfit to noise instead of content because the attention and loss landscape behave predictably under the chosen strategy.

What would settle it

Fine-tune an LVLM on MMGuard-protected data, then measure its accuracy on clean test data without the perturbations and compare to accuracy after fine-tuning on the same data without protection.
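That settling experiment reduces to a simple protocol. The helper names below (`protect`, `fine_tune`, `accuracy`) are hypothetical placeholders, not the paper's API; the point is that both models are scored on the same clean, unperturbed test set.

```python
# Hypothetical protocol sketch; protect, fine_tune, and accuracy are
# placeholder callables, not the paper's API.
def protection_gap(base_model, train_set, test_set, protect, fine_tune, accuracy):
    """Clean-test accuracy drop caused by protecting the fine-tuning data.

    A positive gap means the defense worked: the model fine-tuned on
    protected data performs worse on the SAME clean test set.
    """
    model_clean = fine_tune(base_model, train_set)
    model_protected = fine_tune(base_model, [protect(ex) for ex in train_set])
    return accuracy(model_clean, test_set) - accuracy(model_protected, test_set)

# Toy illustration with stand-in components: "fine-tuning" memorizes the
# training items and "accuracy" is exact-match recall on the test items.
memorize = lambda model, data: set(data)
recall = lambda model, items: sum(x in model for x in items) / len(items)
corrupt = lambda x: -x                    # stand-in "perturbation"
gap = protection_gap(None, [1, 2, 3], [1, 2, 3, 4], corrupt, memorize, recall)
print(gap)
```

In the real experiment, `protect` would be MMGuard's perturbation, `fine_tune` an attacker's training pipeline, and `accuracy` the benchmark metric on each of the six datasets.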

Figures

Figures reproduced from arXiv: 2605.14291 by Chengshuai Zhao, Dawei Li, Huan Liu, Zhen Tan, Zhiyuan Yu.

Figure 1
Figure 1: Protect multimodal data from unauthorized fine-tuning of LVLM.
Figure 2
Figure 2: Overview of MMGuard. The defender generates protected multimodal examples by coupling image-unlearnable perturbations with text-unlearnable triggers and using cross-modal binding disruption to steer LVLM attention toward protection-specific shortcuts that fail to transfer to the downstream task.
Figure 3
Figure 3: Protection effectiveness across MMGuard variants under white-box and gray-box scenarios. Lower bars indicate stronger protection.
Figure 4
Figure 4: Attacker training loss on protected data (log scale). Min-min variants converge to lower loss than clean fine-tuning, indicating shortcut fitting, while Max variants plateau at higher loss, indicating training disruption.
Figure 5
Figure 5: Protection transferability of MMGuard-BPH and MMGuard-CRS across attacker fine-tuning strategies. Lower bars indicate stronger protection.
Figure 6
Figure 6: Protection effectiveness under attacker-side data mixing. Lower curves indicate stronger protection.
Figure 7
Figure 7: Protection robustness of MMGuard-BPH and MMGuard-CRS under attacker-side data transformations. Lower bars indicate stronger protection.
Figure 8
Figure 8: Ablation study and parameter sensitivity of MMGuard-BPH. The leftmost panel reports nine ablation conditions on the full method, while the remaining five panels sweep the binding-loss weight λ_bind, inner-loop steps Q, gradient layer depth |K|, image perturbation budget ϵ_x, and text trigger budget ϵ_t. Lower values indicate stronger protection across all panels. Results are averages across six datasets.
Figure 9
Figure 9: Visualization of the cross-modal binding mechanism on a representative sample. Row 1: released images; Row 2: perturbation and perturbation tokens Ω_δ (cyan boxes); Row 3: answer-token attention maps; Row 4: head-averaged attention mass distribution with corresponding layerwise curves.
Figure 10
Figure 10: Per-sample protection examples on RealWorldQA. Rows (top to bottom): Clean, MMGuard-BPH, MMGuard-CRS, MMGuard-BPH-Max, and MMGuard-CRS-Max. Columns: samples selected randomly to span the dataset's aspect-ratio range. Within each row, cells share a fixed pixel height and are concatenated edge-to-edge at native aspect; within each column, the same source image appears under all five protection variants.
Figure 11
Figure 11: Per-sample protection examples on MMStar.
Figure 12
Figure 12: Per-sample protection examples on ScienceQA.
Figure 13
Figure 13: Per-sample protection examples on VQA-RAD.
Figure 14
Figure 14: Per-sample protection examples on TextVQA.
Figure 15
Figure 15: Per-sample protection examples on DocVQA.
Original abstract

The rapid advancement of Large Vision-Language Models (LVLMs) is increasingly accompanied by unauthorized scraping and training on multimodal web data, posing severe copyright and privacy risks to data owners. Existing countermeasures, such as machine unlearning and watermarks, are inherent post-hoc approaches that act only after intellectual property infringement has already occurred. In this work, we propose MMGuard to empower data owners to proactively protect their multimodal data against unauthorized LVLM fine-tuning. MMGuard generates unlearnable examples by injecting human-imperceptible perturbations that actively exploit the learning dynamics of LVLMs. By minimizing the training loss, the perturbation creates an optimization shortcut, causing the model to overfit to the noise and thereby degrading downstream performance when the perturbation is absent during inference. To further strengthen this defense, MMGuard introduces a cross-modal binding disruption, strategically shifting LVLM attention to enforce a spurious correlation between the noise and the training target with theoretical guarantees. Enhanced by an ensemble learning strategy for cross-model transferability, MMGuard is evaluated against nine open-source LVLMs across six datasets. Our comprehensive results demonstrate effective, stealthy, and robust protection under white-box, gray-box, and black-box threat models, establishing a mechanistic advantage in proactively defending against aggressive fine-tuning exploitation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes MMGuard, a proactive method to protect multimodal data from unauthorized LVLM fine-tuning. It generates unlearnable examples via human-imperceptible perturbations that create an optimization shortcut (forcing overfitting to noise) and a cross-modal binding disruption that enforces spurious noise-target correlations with theoretical guarantees. An ensemble strategy improves cross-model transferability. The approach is evaluated on nine open-source LVLMs across six datasets under white-box, gray-box, and black-box threat models, claiming effective, stealthy, and robust protection.

Significance. If the mechanistic claims and theoretical guarantees hold after verification, this would offer a meaningful shift from post-hoc defenses (unlearning, watermarks) to proactive data protection, with potential impact on copyright and privacy practices for web-scraped multimodal training data.

major comments (2)
  1. [Abstract] The claim that cross-modal binding disruption 'enforces a spurious correlation between the noise and the training target with theoretical guarantees' is load-bearing for the robustness argument across nine architectures, yet no derivation, equation, or section is referenced to show how the guarantee follows from the perturbation strategy; if attention heads route around the noise differently than assumed, the downstream degradation fails.
  2. [Evaluation] Results on nine models and six datasets are presented without ablation studies isolating the binding term, error analysis, or mechanistic verification that the optimization shortcut (rather than other factors) drives the observed protection; this leaves the central claim of a 'mechanistic advantage' unconfirmed by the reported experiments alone.
minor comments (1)
  1. The abstract refers to 'human-imperceptible perturbations' and 'ensemble learning strategy' without specifying the perturbation generation algorithm, imperceptibility metrics (e.g., PSNR/SSIM bounds), or ensemble construction details; these should be clarified for reproducibility.
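On the imperceptibility metrics the minor comment asks for, PSNR is straightforward to report alongside any perturbation budget. A minimal sketch (the 0.05 budget below is an illustrative value, not the paper's):

```python
import numpy as np

def psnr(clean, perturbed, max_val=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the clean image."""
    mse = np.mean((np.asarray(clean, dtype=float) -
                   np.asarray(perturbed, dtype=float)) ** 2)
    if mse == 0.0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

# A uniform 0.05 perturbation on a [0, 1]-scaled image (an illustrative
# budget, not the paper's) gives roughly 26 dB:
img = np.zeros((8, 8))
print(round(psnr(img, img + 0.05), 2))   # 26.02
```

Reporting such bounds per dataset would make the 'human-imperceptible' claim directly checkable.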

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and will revise the manuscript to improve clarity on the theoretical claims and strengthen the experimental validation of the mechanistic components.

Point-by-point responses
  1. Referee: [Abstract] The claim that cross-modal binding disruption 'enforces a spurious correlation between the noise and the training target with theoretical guarantees' is load-bearing for the robustness argument across nine architectures, yet no derivation, equation, or section is referenced to show how the guarantee follows from the perturbation strategy; if attention heads route around the noise differently than assumed, the downstream degradation fails.

    Authors: We appreciate the referee identifying this gap in referencing. The derivation of the spurious correlation is formalized in Section 3.3, where we model the attention routing under the perturbation and prove that the noise-target correlation is enforced via a bound on the attention weights (Equation 7). To make this explicit and address potential routing variations, we will revise the abstract to directly cite Section 3.3 and Equation 7, expand the discussion in Section 3.3 on the assumption robustness, and add a short proof sketch in the main text. revision: yes

  2. Referee: [Evaluation] Results on nine models and six datasets are presented without ablation studies isolating the binding term, error analysis, or mechanistic verification that the optimization shortcut (rather than other factors) drives the observed protection; this leaves the central claim of a 'mechanistic advantage' unconfirmed by the reported experiments alone.

    Authors: We agree that additional ablations and mechanistic checks would strengthen the evaluation. In the revision, we will add ablation studies comparing the full MMGuard against variants without the binding disruption term across the nine models. We will also include error analysis (e.g., per-dataset variance and failure cases) and mechanistic verification via attention map visualizations and training loss curves to isolate the optimization shortcut's contribution. These will be placed in a new subsection of the evaluation. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper introduces MMGuard via perturbation injection to create optimization shortcuts and cross-modal binding disruption to enforce spurious correlations, with claims of theoretical guarantees. No equations, fitted parameters renamed as predictions, or self-citations appear in the abstract or described text that reduce these mechanisms back to their own inputs by construction. The approach applies standard attention and loss concepts to LVLMs without self-definitional loops or ansatzes imported from prior author work; evaluation across models and datasets provides external empirical grounding rather than internal reduction. The derivation remains self-contained against the stated benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review limits visibility into exact parameters; the core premise rests on domain assumptions about LVLM training dynamics rather than new invented entities or fitted constants.

axioms (1)
  • domain assumption LVLM learning dynamics permit creation of optimization shortcuts via imperceptible perturbations that cause overfitting to noise
    Invoked to justify why minimizing training loss on perturbed data degrades performance when perturbation is absent at inference.

pith-pipeline@v0.9.0 · 5549 in / 1248 out tokens · 31015 ms · 2026-05-15T02:35:56.750302+00:00 · methodology

discussion (0)

