TAKO demonstrates real-time adversarial takeover of robotic diffusion policies via reusable universal patches on visual inputs, achieving 100% success in steering attacker-chosen trajectories across multiple tasks, encoders, and diffusion methods.
Adversarial Patch
22 Pith papers cite this work. Polarity classification is still indexing.
abstract
We present a method to create universal, robust, targeted adversarial image patches in the real world. The patches are universal because they can be used to attack any scene, robust because they work under a wide variety of transformations, and targeted because they can cause a classifier to output any target class. These adversarial patches can be printed, added to any scene, photographed, and presented to image classifiers; even when the patches are small, they cause the classifiers to ignore the other items in the scene and report a chosen target class. To reproduce the results from the paper, our code is available at https://github.com/tensorflow/cleverhans/tree/master/examples/adversarial_patch
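The three properties named in the abstract can be illustrated with a small sketch. This is not the authors' code (that lives in the cleverhans repository linked above). It makes several assumptions for illustration: a toy linear softmax classifier `W` stands in for a real image classifier, images are grayscale 8x8 arrays, and the only transformation sampled is a random patch location, whereas the abstract describes a wider variety of transformations. The names `apply_patch`, `train_patch`, and `mean_target_prob` are hypothetical. One patch is trained over many images (universal), over random placements (a limited form of robust), and toward a chosen class (targeted).

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def apply_patch(image, patch, top, left):
    """Return a copy of `image` with `patch` pasted at (top, left)."""
    out = image.copy()
    h, w = patch.shape
    out[top:top + h, left:left + w] = patch
    return out

def train_patch(W, images, target, patch_size=4, steps=200, lr=0.5, seed=0):
    """Stochastic gradient ascent on log p(target | patched image).

    Each step samples one image and one placement, so the same patch
    is optimized across scenes and locations. W has shape
    (n_classes, side * side) and is an assumed toy classifier.
    """
    rng = np.random.default_rng(seed)
    n_classes, n_pixels = W.shape
    side = int(np.sqrt(n_pixels))
    patch = rng.uniform(0.0, 1.0, (patch_size, patch_size))
    onehot = np.eye(n_classes)[target]
    for _ in range(steps):
        img = images[rng.integers(len(images))]
        top = rng.integers(0, side - patch_size + 1)
        left = rng.integers(0, side - patch_size + 1)
        x = apply_patch(img, patch, top, left)
        p = softmax(W @ x.ravel())
        # For a linear softmax model: d log p_target / dx = W^T (onehot - p)
        grad_x = (W.T @ (onehot - p)).reshape(side, side)
        grad_patch = grad_x[top:top + patch_size, left:left + patch_size]
        # Keep pixel values in the valid [0, 1] range
        patch = np.clip(patch + lr * grad_patch, 0.0, 1.0)
    return patch

def mean_target_prob(W, images, patch, target):
    """Average p(target) over every image and every valid placement."""
    side = images.shape[1]
    k = patch.shape[0]
    probs = [
        softmax(W @ apply_patch(img, patch, t, l).ravel())[target]
        for img in images
        for t in range(side - k + 1)
        for l in range(side - k + 1)
    ]
    return float(np.mean(probs))
```

A usage sketch under the same assumptions: draw `W` and `images` from a random generator, call `train_patch(W, images, target=2)`, and compare `mean_target_prob` for the trained patch against a random patch of the same size. How much the target probability rises depends on the classifier. A real attack would differentiate through a deep network and would also sample transformations beyond location.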
citation-role summary
roles: background 2
citation-polarity summary
polarities: background 2
representative citing papers
ForesightSafety-VLA creates a diagnostic benchmark for VLA safety with taxonomy across physical, language, and visual risks, showing perception and structure variations cause more safety degradation than language changes in tested models.
ProSA auditing framework shows that structural loss metrics track OCR instability and downstream QA degradation far better than area-based footprint measures across two document parsers on 1,000 pages.
Thermally activated clothing with thermochromic dyes and heaters creates dynamic adversarial patterns that evade AI surveillance in visible and infrared modalities while appearing ordinary when inactive.
Adversarial hubs can be generated to be retrieved as top-1 for over 84% of test queries in text-to-image retrieval, far exceeding natural hubs.
Derail adversarial perturbations hijack the scoring head in generative E2E driving planners, flipping safe to unsafe trajectory selection with 39-80% score drops and up to 50% collision rates.
Groups of users can extract a proxy from a black-box model via pooled queries and apply optimized per-class perturbations at test time to close most subgroup accuracy gaps on image classification tasks.
A reinforcement learning attacker manipulates client sensor observations in federated learning to induce repetitive server memory updates, achieving around 70% repeated update rate and enabling remote Rowhammer bit flips on an automatic speech recognition model.
A DIP-based optimization produces adversarial perturbations and patches that are more robust to affine transformations than standard high-frequency noise while staying imperceptible.
Introduces CADEX to generate domain-constrained counterfactual explanations for ML models using adversarial perturbations.
A physical patch suppresses all object detections by YOLOv3 even for distant objects without overlapping them.
TRAP is a tail-aware ranking attack that plants a backdoor in world models so that a trigger causes the model to reorder a few critical imagined trajectories and redirect planning while preserving normal behavior on clean inputs.
AdvAD produces physical-world adversarial patches with improved transferability to unseen object detectors by multi-model optimization, adaptive balancing, and physical variation robustness.
TriPatch generates transferable physical adversarial patches via multi-stage triplet loss, appearance consistency, and data augmentation to achieve higher attack success rates on pedestrian detectors than prior methods.
SPAR is a street-legal physical rim that cuts modern ALPR accuracy by 60% and reaches 18% targeted impersonation while costing under $100 and requiring no plate modification.
A decision-support framework applies AFT models to show Nvidia L4 GPUs yield 20% longer adversarial survival time at 75% lower cost than V100, with inference latency as the strongest robustness predictor.
Adversarial patches transfer across three VLM architectures in autonomous driving scenarios with 73-91% success rates and affect 65-79% of critical decision frames even without target-specific optimization.
AEGIS combines SemantiGAN filtering with evidential learning on five handcrafted instability metrics to detect adversarial attacks, reporting 92.1% AUROC on Tiny ImageNet across six attack types.
An empirical comparison finds that no single inference-time defense dominates for MLLMs and that combinations cause 97-100% over-refusal on benign queries; it recommends adaptive selection based on model and attack type.
Digitally optimized adversarial patches with printability constraints reduce objectness in YOLOv3 aerial detectors, with physical transfer success varying by ON/OFF configuration and weather augmentation showing limited benefit.
RACF corrects inconsistent depth camera distance estimates in autonomous vehicles using LiDAR and kinematic redundancy, achieving up to 35% RMSE reduction and better braking in tests on a Quanser QCar 2 platform.
The paper organizes existing physical adversarial attack literature into a surveillance-oriented taxonomy emphasizing temporal persistence, multi-modal sensing, carrier realism, and system-level objectives, concluding that robustness requires system-level evaluation over time and across sensors.
citing papers explorer
- TRAP: Tail-aware Ranking Attack for World-Model Planning
TRAP is a tail-aware ranking attack that plants a backdoor in world models so that a trigger causes the model to reorder a few critical imagined trajectories and redirect planning while preserving normal behavior on clean inputs.
- Street-Legal Physical-World Adversarial Rim for License Plates
SPAR is a street-legal physical rim that cuts modern ALPR accuracy by 60% and reaches 18% targeted impersonation while costing under $100 and requiring no plate modification.