BadNets: Evaluating Backdooring Attacks on Deep Neural Networks
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: baseline 1
citation-polarity summary: baseline 1
citing papers explorer
- Your Agent Is Mine: Measuring Malicious Intermediary Attacks on the LLM Supply Chain
  Malicious LLM API routers actively perform payload injection and secret exfiltration, with 9 of 428 tested routers showing malicious behavior and further poisoning risks from leaked credentials.
- Gungnir: Exploiting Stylistic Features in Images for Backdoor Attacks on Diffusion Models
  Gungnir shows that style-based triggers with RAN and STTR techniques can activate backdoors in diffusion models while evading detection and surviving fine-tuning.
- Multimodal Backdoor Attack on VLMs for Autonomous Driving via Graffiti and Cross-Lingual Triggers
  The GLA backdoor attack on DriveVLM uses naturalistic graffiti and cross-lingual triggers to reach a 90% attack success rate (ASR) at a 10% poisoning ratio while improving some clean-task metrics such as BLEU-1.
- Your Neighbors Know: Leveraging Local Neighborhoods for Backdoor Detection in Decentralized Learning