Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks

David Evans, Weilin Xu, Yanjun Qi

Authors on Pith no claims yet

classification 💻 cs.CV cs.CRcs.LG

keywords examplesfeatureadversarialsqueezingdeepdetectingdetectionemph

read the original abstract

Although deep neural networks (DNNs) have achieved great success in many tasks, they can often be fooled by \emph{adversarial examples} that are generated by adding small but purposeful distortions to natural examples. Previous studies to defend against adversarial examples mostly focused on refining the DNN models, but have either shown limited success or required expensive computation. We propose a new strategy, \emph{feature squeezing}, that can be used to harden DNN models by detecting adversarial examples. Feature squeezing reduces the search space available to an adversary by coalescing samples that correspond to many different feature vectors in the original space into a single sample. By comparing a DNN model's prediction on the original input with that on squeezed inputs, feature squeezing detects adversarial examples with high accuracy and few false positives. This paper explores two feature squeezing methods: reducing the color bit depth of each pixel and spatial smoothing. These simple strategies are inexpensive and complementary to other defenses, and can be combined in a joint detection framework to achieve high detection rates against state-of-the-art attacks.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Physical Adversarial Clothing Evades Visible-Thermal Detectors via Non-Overlapping RGB-T Pattern
cs.CV 2026-05 unverdicted novelty 6.0

Non-overlapping RGB-T adversarial patterns on clothing, optimized with spatial discrete-continuous optimization, achieve high attack success rates against multiple RGB-T detector fusion architectures in both digital a...
Detecting Adversarial Data via Provable Adversarial Noise Amplification
cs.LG 2026-05 unverdicted novelty 6.0

A provable adversarial noise amplification theorem under sufficient conditions enables a custom-trained detector that identifies adversarial examples at inference time using enhanced layer-wise noise signals.
TRAP: Tail-aware Ranking Attack for World-Model Planning
cs.LG 2026-05 unverdicted novelty 6.0

TRAP is a tail-aware ranking attack that plants a backdoor in world models so that a trigger causes the model to reorder a few critical imagined trajectories and redirect planning while preserving normal behavior on c...