First DTW-certified robust anomaly detection for time series via randomized smoothing adapted through an l_p-to-DTW lower-bound transformation.
Certified Adversarial Robustness via Randomized Smoothing
7 Pith papers cite this work. Polarity classification is still indexing.
abstract
We show how to turn any classifier that classifies well under Gaussian noise into a new classifier that is certifiably robust to adversarial perturbations under the $\ell_2$ norm. This "randomized smoothing" technique has been proposed recently in the literature, but existing guarantees are loose. We prove a tight robustness guarantee in $\ell_2$ norm for smoothing with Gaussian noise. We use randomized smoothing to obtain an ImageNet classifier with e.g. a certified top-1 accuracy of 49% under adversarial perturbations with $\ell_2$ norm less than 0.5 (=127/255). No certified defense has been shown feasible on ImageNet except for smoothing. On smaller-scale datasets where competing approaches to certified $\ell_2$ robustness are viable, smoothing delivers higher certified accuracies. Our strong empirical results suggest that randomized smoothing is a promising direction for future research into adversarially robust classification. Code and models are available at http://github.com/locuslab/smoothing.
citation-role summary
citation-polarity summary
roles
background 4representative citing papers
A new 7x4 taxonomy organizes agentic AI security threats by architectural layer and persistence timescale, revealing under-explored upper layers and missing defenses after surveying 116 papers.
Landseer offers a containerized modular system to integrate and evaluate combinations of machine learning defenses, with an initial analysis of 35 defenses highlighting replicability challenges.
AI peer review systems are vulnerable to prompt injections, prestige biases, assertion strength effects, and contextual poisoning, as demonstrated by a new attack taxonomy and causal experiments on real conference submissions.
Extremely quantized LLMs exhibit systematic smoothness degradation that reduces effective token candidates and degrades generation; a smoothness-preserving principle in PTQ and QAT delivers gains beyond numerical accuracy.
Jailbreak prompts with adversarial suffixes have high GPT-2 perplexity, and a LightGBM model on perplexity and length detects most attacks.
citing papers explorer
-
Fortifying Time Series: DTW-Certified Robust Anomaly Detection
First DTW-certified robust anomaly detection for time series via randomized smoothing adapted through an l_p-to-DTW lower-bound transformation.
-
A Systematic Survey of Security Threats and Defenses in LLM-Based AI Agents: A Layered Attack Surface Framework
A new 7x4 taxonomy organizes agentic AI security threats by architectural layer and persistence timescale, revealing under-explored upper layers and missing defenses after surveying 116 papers.
-
Landseer: Exploring the Machine Learning Defense Landscape
Landseer offers a containerized modular system to integrate and evaluate combinations of machine learning defenses, with an initial analysis of 35 defenses highlighting replicability challenges.
-
When AI reviews science: Can we trust the referee?
AI peer review systems are vulnerable to prompt injections, prestige biases, assertion strength effects, and contextual poisoning, as demonstrated by a new attack taxonomy and causal experiments on real conference submissions.
-
Fitting Is Not Enough: Smoothness in Extremely Quantized LLMs
Extremely quantized LLMs exhibit systematic smoothness degradation that reduces effective token candidates and degrades generation; a smoothness-preserving principle in PTQ and QAT delivers gains beyond numerical accuracy.
-
Detecting Language Model Attacks with Perplexity
Jailbreak prompts with adversarial suffixes have high GPT-2 perplexity, and a LightGBM model on perplexity and length detects most attacks.
- Unlearning with Asymmetric Sources: Improved Unlearning-Utility Trade-off with Public Data