pith. machine review for the scientific record.

arxiv: 2605.01761 · v1 · submitted 2026-05-03 · 💻 cs.CV

Recognition: 3 theorem links · Lean Theorem

TrajShield: Trajectory-Level Safety Mediation for Defending Text-to-Video Models Against Jailbreak Attacks

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 19:23 UTC · model grok-4.3

classification 💻 cs.CV
keywords text-to-video safety · jailbreak defense · trajectory simulation · causal intervention · temporally emergent risk · inference-time defense · prompt rewriting · video generation safety

The pith

TrajShield defends text-to-video models by simulating prompt trajectories to locate and neutralize safety risks with minimal rewrites.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces TrajShield as a training-free method to protect text-to-video generators from producing unsafe content. It treats safety as a problem of intervening in the prompt's implied temporal sequence rather than editing surface words. Existing defenses borrowed from image models leave gaps because they miss how video generators can turn safe starting prompts into harmful scenes through narrative buildup. If the approach works, it would let users run T2V systems on a wider range of prompts without frequent unsafe outputs or the need for model retraining.

Core claim

TrajShield reformulates T2V safety as a causal intervention in a temporally structured semantic space. It handles explicit unsafe prompts, jailbreak attacks, and temporally emergent risks in a unified manner by simulating the implied trajectory of a prompt, localizing the causal origin of potential risk, and applying a minimally invasive rewrite that neutralizes the risk while preserving safety-irrelevant semantics.
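
Read literally, and in notation of our own choosing rather than the paper's, the claim amounts to intervening on the first unsafe step of a prompt's implied trajectory. A hedged formalization (Sim, U, and d are placeholders for the paper's simulator, safety judge, and semantic distance, not symbols it defines):

    % Our notation, not the paper's. A prompt p induces an implied
    % trajectory of scene states s_1, ..., s_T; U(s_t) = 1 marks an unsafe state.
    \begin{aligned}
      (s_1, \dots, s_T) &\sim \mathrm{Sim}(p) && \text{simulated implied trajectory} \\
      t^{\star} &= \min\{\, t : U(s_t) = 1 \,\} && \text{causal origin of the risk} \\
      p^{\star} &= \arg\min_{p'} d(p, p') \;\; \text{s.t.}\; U\!\big(\mathrm{Sim}(p')_t\big) = 0 \;\forall t && \text{minimally invasive rewrite}
    \end{aligned}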

What carries the argument

The trajectory simulation that traces a prompt's implied sequence of events to identify where harmful content would first arise, followed by a targeted rewrite at that point.
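
A minimal sketch of that loop in Python, assuming an LLM-backed helper for each stage; simulate_trajectory, rewrite_at, and the Step fields are hypothetical stand-ins, not the paper's actual interfaces:

    from dataclasses import dataclass

    @dataclass
    class Step:
        """One event in the prompt's implied timeline."""
        description: str
        unsafe: bool  # flagged by some safety judge, e.g. an LLM classifier

    def simulate_trajectory(prompt: str) -> list[Step]:
        """Hypothetical: unroll the prompt into ordered scene events and label
        each one for safety. Stands in for the paper's simulation stage."""
        raise NotImplementedError

    def rewrite_at(prompt: str, step: Step) -> str:
        """Hypothetical: minimally edit only the part of the prompt that causes
        `step`, leaving safety-irrelevant semantics untouched."""
        raise NotImplementedError

    def defend(prompt: str, max_rounds: int = 3) -> str:
        """Inference-time loop: simulate, localize the first unsafe step,
        rewrite, and re-check until the implied trajectory is clean."""
        for _ in range(max_rounds):
            steps = simulate_trajectory(prompt)
            first_unsafe = next((s for s in steps if s.unsafe), None)
            if first_unsafe is None:
                return prompt  # trajectory is clean; pass through unchanged
            prompt = rewrite_at(prompt, first_unsafe)
        return prompt  # best-effort rewrite after max_rounds

No weights or retraining are touched; everything happens on the prompt before it reaches the generator, which is what makes the approach deployable at inference time.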

If this is right

  • Unified defense against explicit unsafe prompts, rephrased jailbreaks, and risks that only appear after the video generator adds temporal coherence.
  • State-of-the-art results on T2VSafetyBench across 14 safety categories and multiple video backends.
  • Average attack success rate (ASR) reduction of 52.44 percent while keeping high semantic fidelity to the original prompt; a sketch of how such a figure is computed follows this list.
  • No need for retraining or access to model weights, allowing deployment at inference time on existing systems.
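
The 52.44 percent figure is an aggregate over categories and backends. A minimal sketch of how an ASR reduction of that kind is typically computed; the counts below are illustrative, not numbers from the paper, and the paper may report the reduction in percentage points or relative terms:

    def asr(successful_attacks: int, total_prompts: int) -> float:
        """Attack success rate: fraction of adversarial prompts that still
        yield unsafe videos once the defense is applied."""
        return successful_attacks / total_prompts

    # Illustrative counts only -- not taken from the paper.
    baseline = asr(successful_attacks=78, total_prompts=100)   # 0.78 without defense
    defended = asr(successful_attacks=26, total_prompts=100)   # 0.26 with defense

    # Reported here in percentage points, one common convention.
    print(f"ASR reduction: {(baseline - defended) * 100:.2f} points")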

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same trajectory-localization idea could be tested on other time-based generators such as text-to-audio or long image sequences.
  • Safety filters might move from one-time prompt checks to continuous monitoring of how the generation unfolds step by step.
  • If the minimal-rewrite step proves robust, it could reduce the need for heavy post-generation content filters in production video tools.

Load-bearing premise

That the simulation of a prompt's implied trajectory can reliably find the source of safety problems and that a small rewrite at that source will remove the risk without changing the rest of the intended video content.

What would settle it

A set of new jailbreak prompts where TrajShield's rewrites either fail to lower attack success rates below baseline defenses or produce videos whose content differs substantially from what the original prompt requested.
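
A minimal sketch of that experiment, assuming a video safety judge and a text encoder for fidelity; generate, judge_unsafe, and embed are hypothetical stand-ins, and defend is the loop sketched earlier:

    def cosine(u, v):
        """Plain cosine similarity between two embedding vectors."""
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = sum(a * a for a in u) ** 0.5
        norm_v = sum(b * b for b in v) ** 0.5
        return dot / (norm_u * norm_v)

    def evaluate_defense(prompts, generate, defend, judge_unsafe, embed):
        """For each jailbreak prompt, measure (a) whether unsafe content still
        gets through after the rewrite and (b) how far the rewrite drifts from
        the original request."""
        rows = []
        for p in prompts:
            p_safe = defend(p)  # trajectory-level rewrite
            rows.append({
                "asr_baseline": judge_unsafe(generate(p)),       # 1 if unsafe
                "asr_defended": judge_unsafe(generate(p_safe)),  # 1 if unsafe
                "fidelity": cosine(embed(p), embed(p_safe)),     # semantic-drift proxy
            })
        return rows

TrajShield's claim would fail on this harness if asr_defended does not drop below the rates of baseline defenses, or if fidelity collapses because the rewrite changed what the video is about.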

Figures

Figures reproduced from arXiv: 2605.01761 by Jiaye Lin, Nizhang Li, Quanchen Zou, Wenxin Zhang, Xiangzheng Zhang, Yangchen Zeng, Zonghao Ying.

Figure 1
Figure 1: Example of the defense effect of TrajShield. view at source ↗
Figure 2
Figure 2: The overall pipeline of our proposed TrajShield, which consists of Motion-Semantic Decoupling, Temporal Risk Propagation Graph, and Temporally-Consistent Safe Rewriting to effectively defend T2V models against jailbreak attacks. view at source ↗
Figure 3
Figure 3: Qualitative comparison of video generation before and after applying T… view at source ↗
Figure 4
Figure 4: Comparison of the defense performance of T… view at source ↗
Figure 5
Figure 5: A visual example from SafeWatch. We present the real video, the video … view at source ↗
Figure 6
Figure 6: Comparison of ASR on the SafeWatch dataset before and after … view at source ↗
Figure 8
Figure 8: Ablation study of the three key modules in T… view at source ↗
read the original abstract

Text-to-Video (T2V) models have demonstrated remarkable capability in generating temporally coherent videos from natural language prompts, yet they also risk producing unsafe content such as violence or explicit material. Existing prompt-level defenses are largely inherited from text-to-image safety and operate on the lexical surface of the input, making them vulnerable to jailbreak attacks that disguise harmful intent through rephrasing or adversarial prompting. Moreover, T2V generation introduces a distinctive challenge overlooked by prior work: temporally emergent risk, where a seemingly benign prompt leads to unsafe content through the generator's temporal extrapolation toward narrative coherence. We propose \method{}, a training-free, inference-time defense framework that reformulates T2V safety as a causal intervention in a temporally structured semantic space. TrajShield handles explicit unsafe prompts, jailbreak attacks, and temporally emergent risks in a unified manner by simulating the implied trajectory of a prompt, localizing the causal origin of potential risk, and applying a minimally invasive rewrite that neutralizes the risk while preserving safety-irrelevant semantics. Experiments on T2VSafetyBench across 14 safety categories and multiple T2V backends demonstrate that TrajShield achieves state-of-the-art defenseive performance while maintaining high semantic fidelity, substantially outperforming existing defenses, with an average ASR reduction of 52.44\%.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents TrajShield, a training-free, inference-time defense framework for text-to-video (T2V) models. It reformulates safety as causal intervention by simulating the implied trajectory of a prompt to localize the causal origin of potential risk (explicit, jailbreak, or temporally emergent), then applies a minimally invasive rewrite to neutralize the risk while preserving safety-irrelevant semantics. Experiments on T2VSafetyBench across 14 safety categories and multiple T2V backends report state-of-the-art defensive performance with an average ASR reduction of 52.44% while maintaining high semantic fidelity.

Significance. If the results hold, this would be a meaningful contribution to T2V safety by addressing temporally emergent risks arising from the generator's coherence-seeking behavior, a challenge not handled by prompt-level defenses inherited from text-to-image models. The training-free nature and unified treatment of risk types are practical strengths; the reported outperformance on a broad benchmark could inform future work on multimodal generative safety.

major comments (2)
  1. [Method (trajectory simulation and causal intervention)] The central claim depends on the trajectory simulation reliably localizing risk origins within the T2V model's actual temporal dynamics. The manuscript provides no direct validation (e.g., alignment metrics between simulated trajectories and generated video content evolution or attention patterns) to confirm that the external simulation corresponds to the model's internal extrapolation, which is load-bearing for neutralizing real emergent hazards rather than spurious ones. A sketch of one such alignment check appears after the minor comments below.
  2. [§4 (Experiments)] The SOTA claim and 52.44% average ASR reduction are presented without per-category or per-backend breakdowns, variance measures, or statistical significance tests against all baselines. This weakens the ability to evaluate robustness of the outperformance across the 14 categories and multiple backends.
minor comments (2)
  1. [Abstract] 'defenseive' is a typo and should read 'defensive'.
  2. [Throughout] Ensure consistent definition of acronyms (e.g., ASR) on first use and clear notation for trajectory components throughout.
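
A sketch of the alignment check asked for in the first major comment, reusing the Step dataclass and cosine helper from the sketches above; clip_image and clip_text are assumed CLIP-style encoders, not components the paper specifies:

    def semantic_drift(frames, clip_image):
        """Cosine distance between consecutive frame embeddings: where the
        generated video's content shifts fastest."""
        embs = [clip_image(f) for f in frames]
        return [1.0 - cosine(a, b) for a, b in zip(embs, embs[1:])]

    def trajectory_alignment(frames, steps, clip_image, clip_text):
        """Best-matching frame for each simulated step. If the step flagged as
        the causal risk origin lands on the frames where unsafe content actually
        appears, the external simulation tracks the model's extrapolation."""
        frame_embs = [clip_image(f) for f in frames]
        matches = []
        for step in steps:
            text_emb = clip_text(step.description)
            sims = [cosine(text_emb, fe) for fe in frame_embs]
            matches.append(max(range(len(sims)), key=sims.__getitem__))
        return matches  # one frame index per simulated step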

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and commit to revisions that strengthen the manuscript without misrepresenting our current results.

read point-by-point responses
  1. Referee: [Method (trajectory simulation and causal intervention)] The central claim depends on the trajectory simulation reliably localizing risk origins within the T2V model's actual temporal dynamics. The manuscript provides no direct validation (e.g., alignment metrics between simulated trajectories and generated video content evolution or attention patterns) to confirm that the external simulation corresponds to the model's internal extrapolation, which is load-bearing for neutralizing real emergent hazards rather than spurious ones.

    Authors: We agree that direct empirical alignment between the external trajectory simulation and the model's internal temporal dynamics would provide stronger mechanistic support. Our simulation is constructed as a proxy for the prompt's implied semantic evolution (using causal graph modeling over key entities and events), chosen for its training-free applicability across backends. While indirect validation exists via consistent ASR reductions on temporally emergent risks, we acknowledge the gap. In revision we will add (1) a detailed derivation of the simulation procedure with explicit assumptions, (2) new quantitative comparisons of simulated trajectories against attention rollout and frame-wise semantic drift in generated videos on a subset of prompts, and (3) a limitations paragraph noting that perfect internal matching is not claimed. revision: yes

  2. Referee: [§4 (Experiments)] The SOTA claim and 52.44% average ASR reduction are presented without per-category or per-backend breakdowns, variance measures, or statistical significance tests against all baselines. This weakens the ability to evaluate robustness of the outperformance across the 14 categories and multiple backends.

    Authors: The referee is correct that the current presentation emphasizes the aggregate 52.44% figure. The full experimental section already contains per-backend results, but per-category breakdowns, standard deviations across multiple seeds, and formal significance testing (e.g., Wilcoxon signed-rank or paired t-tests with correction) are not reported in tables. We will revise §4 to include: expanded tables with all 14 categories and all backends, variance statistics, and statistical tests against every baseline. This will allow readers to assess robustness directly. revision: yes
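
The significance testing the rebuttal commits to could be as simple as a paired, non-parametric test over per-category ASR, for example with scipy's Wilcoxon signed-rank test; the arrays below are illustrative placeholders, not the paper's numbers:

    import numpy as np
    from scipy.stats import wilcoxon

    # One ASR value per T2VSafetyBench category (14 total), for a baseline
    # defense and for TrajShield. Placeholder values only.
    asr_baseline = np.array([0.71, 0.64, 0.80, 0.55, 0.62, 0.77, 0.69,
                             0.58, 0.73, 0.66, 0.81, 0.60, 0.70, 0.68])
    asr_defended = np.array([0.31, 0.28, 0.40, 0.22, 0.30, 0.35, 0.29,
                             0.25, 0.33, 0.27, 0.38, 0.24, 0.32, 0.30])

    # Paired test across categories: is the defended ASR systematically lower?
    stat, p_value = wilcoxon(asr_baseline, asr_defended, alternative="greater")
    print(f"Wilcoxon statistic={stat:.1f}, p={p_value:.4f}")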

Circularity Check

0 steps flagged

No circularity: TrajShield presents an independent causal-intervention framework

full rationale

The paper's core derivation introduces a training-free inference-time method that simulates an implied prompt trajectory, localizes risk origin via causal intervention in semantic space, and applies a minimally invasive rewrite. No equations, parameters, or steps are shown to reduce by construction to fitted inputs, self-definitions, or self-citation chains. The unified handling of explicit, jailbreak, and emergent risks is framed as a novel reformulation rather than a renaming or ansatz imported from prior self-work. The experimental claims rest on external benchmarks (T2VSafetyBench) and comparisons to existing defenses, so the argument is checked against external validation rather than its own constructions.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are explicitly mentioned in the abstract. The framework is described as training-free, relying on simulation of prompt trajectories within existing T2V models.

pith-pipeline@v0.9.0 · 5558 in / 1156 out tokens · 91908 ms · 2026-05-08T19:23:49.741247+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

50 extracted references · 15 canonical work pages · 5 internal anchors

  1. [1] Zhao Wang, Aoxue Li, Lingting Zhu, Yong Guo, Qi Dou, and Zhenguo Li. CustomVideo: Customizing text-to-video generation with multiple subjects. IEEE Transactions on Multimedia, 2026.
  2. [2] Minglu Zhao, Wenmin Wang, Tongbao Chen, Rui Zhang, and Ruochen Li. Ta2v: Text-audio guided video generation. IEEE Transactions on Multimedia, 26:7250–7264, 2024.
  3. [3] OpenAI. Sora 2. https://openai.com/sora/, 2024. Turn your ideas into videos with hyperreal motion and sound.
  4. [4] Kuaishou Technology. Kling AI: Next-gen AI video & AI image generator. Online, 2026. A generative AI platform that creates high-quality videos and images from text or image prompts with native audio and cinematic features.
  5. [5] Team Seedance, Heyi Chen, Siyan Chen, Xin Chen, Yanfei Chen, Ying Chen, Zhuo Chen, Feng Cheng, Tianheng Cheng, Xinqi Cheng, et al. Seedance 1.5 pro: A native audio-visual joint generation foundation model. arXiv preprint arXiv:2512.13507, 2025.
  6. [6] Yibo Miao, Yifan Zhu, Lijia Yu, Jun Zhu, Xiao-Shan Gao, and Yinpeng Dong. T2VSafetyBench: Evaluating the safety of text-to-video generative models. Advances in Neural Information Processing Systems, 37:63858–63872, 2024.
  7. [7] Zhaorun Chen, Francesco Pinto, Minzhou Pan, and Bo Li. SafeWatch: An efficient safety-policy following video guardrail model with transparent explanations. arXiv preprint arXiv:2412.06878, 2024.
  8. [8] Siyuan Liang, Jiayang Liu, Jiecheng Zhai, Tianmeng Fang, Rongcheng Tu, Aishan Liu, Xiaochun Cao, and Dacheng Tao. T2VShield: Model-agnostic jailbreak defense for text-to-video models. International Journal of Computer Vision, 134(4):144, 2026.
  9. [9] Laura Hanu and Unitary team. Detoxify. GitHub. https://github.com/unitaryai/detoxify, 2020.
  10. [10] Liang Zhao, Pingda Huang, Tengtuo Chen, Chunjiang Fu, Qinghao Hu, and Yangqianhui Zhang. Multi-sentence complementarily generation for text-to-image synthesis. IEEE Transactions on Multimedia, 26:8323–8332, 2023.
  11. [11] Senmao Ye, Huan Wang, Mingkui Tan, and Fei Liu. Recurrent affine transformation for text-to-image synthesis. IEEE Transactions on Multimedia, 26:462–473, 2023.
  12. [12] Qingrong Cheng, Keyu Wen, and Xiaodong Gu. Vision-language matching for text-to-image synthesis via generative adversarial networks. IEEE Transactions on Multimedia, 25:7062–7075, 2022.
  13. [13] Pengliang Ji, Chuyang Xiao, Huilin Tai, and Mingxiao Huo. T2VBench: Benchmarking temporal dynamics for text-to-video generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5325–5335, 2024.
  14. [14] Jiwoo Chung, Sangeek Hyun, and Jae-Pil Heo. Style injection in diffusion: A training-free approach for adapting large-scale diffusion models for style transfer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8795–8805, 2024.
  15. [15] Young D Kwon, Rui Li, Sijia Li, Da Li, Sourav Bhattacharya, and Stylianos I Venieris. HierarchicalPrune: Position-aware compression for large-scale diffusion models. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 40, pages 22716–22724, 2026.
  16. [16] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017.
  17. [17] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10684–10695, 2022.
  18. [18] Andreas Blattmann, Tim Dockhorn, Sumith Kulal, Daniel Mendelevitch, Maciej Kilian, Dominik Lorenz, Yam Levi, Zion English, Vikram Voleti, Adam Letts, et al. Stable Video Diffusion: Scaling latent video diffusion models to large datasets. arXiv preprint arXiv:2311.15127, 2023.
  19. [19] Jiuniu Wang, Hangjie Yuan, Dayou Chen, Yingya Zhang, Xiang Wang, and Shiwei Zhang. ModelScope text-to-video technical report. arXiv preprint arXiv:2308.06571, 2023.
  20. [20] Jonathan Ho, Tim Salimans, Alexey Gritsenko, William Chan, Mohammad Norouzi, and David J Fleet. Video diffusion models. Advances in Neural Information Processing Systems, 35:8633–8646, 2022.
  21. [21] Uriel Singer, Adam Polyak, Thomas Hayes, Xi Yin, Jie An, Songyang Zhang, Qiyuan Hu, Harry Yang, Oron Ashual, Oran Gafni, et al. Make-A-Video: Text-to-video generation without text-video data. arXiv preprint arXiv:2209.14792, 2022.
  22. [22] Tim Brooks, Bill Peebles, Connor Holmes, Will DePue, Yufei Guo, Leo Jing, David Schnurr, Joe Taylor, Troy Luhman, Eric Luhman, et al. Video generation models as world simulators. OpenAI Blog, 1(8):1, 2024.
  23. [23] Yu Gao, Haoyuan Guo, Tuyen Hoang, Weilin Huang, Lu Jiang, Fangyuan Kong, Huixia Li, Jiashi Li, Liang Li, Xiaojie Li, et al. Seedance 1.0: Exploring the boundaries of video generation models. arXiv preprint arXiv:2506.09113, 2025.
  24. [24] Google DeepMind. Veo: State-of-the-art video generation model. Model card / documentation, 2025. Available online: https://deepmind.google/models/veo/
  25. [25] Ruidong Chen, Honglin Guo, Lanjun Wang, Chenyu Zhang, Weizhi Nie, and An-An Liu. TRCE: Towards reliable malicious concept erasure in text-to-image diffusion models. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 18927–18936, 2025.
  26. [26] Shilin Lu, Zilan Wang, Leyang Li, Yanzhu Liu, and Adams Wai-Kin Kong. MACE: Mass concept erasure in diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6430–6440, 2024.
  27. [27] Yu Lei, Jinbin Bai, Qingyu Shi, Aosong Feng, Hongcheng Gao, Xiao Zhang, and Rex Ying. Personalized safety alignment for text-to-image diffusion models. arXiv preprint arXiv:2508.01151, 2025.
  28. [28] Feifei Li, Mi Zhang, Yiming Sun, and Min Yang. Detect-and-guide: Self-regulation of diffusion models for safe text-to-image generation via guideline token optimization. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 13252–13262, 2025.
  29. [29] Peigui Qi, Kunsheng Tang, Wenbo Zhou, Weiming Zhang, Nenghai Yu, Tianwei Zhang, Qing Guo, and Jie Zhang. SafeGuider: Robust and practical content safety control for text-to-image models. In Proceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security, pages 2818–2832, 2025.
  30. [30] Yijun Yang, Ruiyuan Gao, Xiao Yang, Jianyuan Zhong, and Qiang Xu. GuardT2I: Defending text-to-image models from adversarial prompts. Advances in Neural Information Processing Systems, 37:76380–76403, 2024.
  31. [31] Jiayang Liu, Siyuan Liang, Shiqian Zhao, Rongcheng Tu, Wenbo Zhou, Aishan Liu, Dacheng Tao, and Siew Kei Lam. T2V-OptJail: Discrete prompt optimization for text-to-video jailbreak attacks. arXiv preprint arXiv:2505.06679, 2025.
  32. [32] Wonjun Lee, Haon Park, Doehyeon Lee, Bumsub Ham, and Suhyun Kim. Jailbreaking on text-to-video models via scene splitting strategy. arXiv preprint arXiv:2509.22292, 2025.
  33. [33] Zonghao Ying, Moyang Chen, Nizhang Li, Zhiqiang Wang, Wenxin Zhang, Quanchen Zou, Zonglei Jing, Aishan Liu, and Xianglong Liu. Veil: Jailbreaking text-to-video models via visual exploitation from implicit language. arXiv preprint arXiv:2511.13127, 2025.
  34. [34] David Chen and William B Dolan. Collecting highly parallel data for paraphrase evaluation. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 190–200, 2011.
  35. [35] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, pages 8748–8763. PMLR, 2021.
  36. [36] Serin Varghese, Yasin Bayzidi, Andreas Bar, Nikhil Kapoor, Sounak Lahiri, Jan David Schneider, Nico M Schmidt, Peter Schlicht, Fabian Huger, and Tim Fingscheidt. Unsupervised temporal consistency metric for video segmentation in highly-automated driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 3...
  37. [37] Yu-Lin Tsai, Chia-Yi Hsu, Chulin Xie, Chih-Hsun Lin, Jia-You Chen, Bo Li, Pin-Yu Chen, Chia-Mu Yu, and Chun-Ying Huang. Ring-A-Bell! How reliable are concept removal methods for diffusion models? arXiv preprint arXiv:2310.10012, 2023.
  38. [38] Yimo Deng and Huangxun Chen. Divide-and-conquer attack: Harnessing the power of LLM to bypass the censorship of text-to-image generation model. arXiv preprint arXiv:2312.07130, 1(2):6, 2023.
  39. [39] Yuchen Yang, Bo Hui, Haolin Yuan, Neil Gong, and Yinzhi Cao. SneakyPrompt: Jailbreaking text-to-image generative models. In 2024 IEEE Symposium on Security and Privacy (SP), pages 897–912. IEEE, 2024.
  40. [40] PixVerse. PixVerse platform: AI video generation and creative studio. Online, 2026. A comprehensive AI video generation platform for text-to-video, image-to-video, and multimodal creative workflows, offering high-fidelity output and integrated tools for creators and developers.
  41. [41] Hailuo AI. Hailuo 2.3 model: Pro-tier AI video generation (text-to-video & image-to-video). Online, 2025.
  42. [42] Google AI. Veo 3.1 generate preview — Gemini API models. https://ai.google.dev/gemini-api/docs/models/veo-3.1-generate-preview, February
  43. [43] Last updated: 2026-02-18; Accessed: 2026-04-29.
  44. [44] OpenAI. Sora 2 is here. https://openai.com/index/sora-2/, September 30
  45. [45] Accessed: 2026-04-29.
  46. [46] Aixin Liu, Bei Feng, Bing Xue, Bingxuan Wang, Bochao Wu, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chenyu Zhang, Chong Ruan, et al. DeepSeek-V3 technical report. arXiv preprint arXiv:2412.19437, 2024.
  47. [47] Aaron Hurst, Adam Lerer, Adam P Goucher, Adam Perelman, Aditya Ramesh, Aidan Clark, AJ Ostrow, Akila Welihinda, Alan Hayes, Alec Radford, et al. GPT-4o system card. arXiv preprint arXiv:2410.21276, 2024.
  48. [48] xAI. Grok 4 Fast: Pushing the frontier of cost-efficient intelligence. https://x.ai/news/grok-4-fast, September 2025. Accessed: 2026-04-29.
  49. [49] Google DeepMind. Gemini Flash: Frontier intelligence at speed. https://deepmind.google/models/gemini/flash/, 2026. Accessed: 2026-04-29.
  50. [50] Kimi Team, Yifan Bai, Yiping Bao, Y Charles, Cheng Chen, Guanduo Chen, Haiting Chen, Huarong Chen, Jiahao Chen, Ningxin Chen, et al. Kimi K2: Open agentic intelligence. arXiv preprint arXiv:2507.20534, 2025.