pith. machine review for the scientific record.

arxiv: 1810.00069 · v1 · submitted 2018-09-28 · 💻 cs.LG · cs.CR · stat.ML

Recognition: unknown

Adversarial Attacks and Defences: A Survey

Authors on Pith: no claims yet
classification: 💻 cs.LG · cs.CR · stat.ML
keywords: learning, deep, adversarial, adversaries, recent, types, attacks, countermeasures
read the original abstract

Deep learning has emerged as a strong and efficient framework that can be applied to a broad spectrum of complex learning problems that were difficult to solve with traditional machine learning techniques. In the last few years, deep learning has advanced so radically that it surpasses human-level performance on a number of tasks. As a consequence, deep learning is used extensively in many day-to-day applications. However, deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye but can lead the model to misclassify its input. In recent times, different types of adversaries, characterized by their threat models, have leveraged these vulnerabilities to compromise deep learning systems where the incentives are high. Hence, it is extremely important to make deep learning algorithms robust against these adversaries. However, only a few strong countermeasures apply across all types of attack scenarios when designing a robust deep learning system. In this paper, we provide a detailed discussion of different types of adversarial attacks under various threat models and elaborate on the efficiency and challenges of recent countermeasures against them.
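The abstract's notion of a "crafted adversarial example" can be illustrated with the Fast Gradient Sign Method (FGSM), one of the standard attacks a survey like this covers. The sketch below is a hypothetical toy, not the paper's own code: a fixed two-feature logistic-regression model whose weights, inputs, and epsilon are chosen here purely for demonstration. A small, bounded perturbation in the direction of the loss gradient's sign flips the prediction.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy model: logistic regression with fixed weights.
w = np.array([1.0, -1.0])
b = 0.0

def predict_proba(x):
    return sigmoid(w @ x + b)

def fgsm(x, y, eps):
    """One FGSM step: move each coordinate of x by eps in the
    direction that increases the loss. For the logistic loss,
    the input gradient is dL/dx = (p - y) * w."""
    p = predict_proba(x)
    grad = (p - y) * w
    return x + eps * np.sign(grad)

x = np.array([0.3, 0.1])      # clean input, classified as class 1
y = 1.0                        # true label
x_adv = fgsm(x, y, eps=0.25)   # per-coordinate perturbation bound

print(predict_proba(x) > 0.5)      # True  (clean input: class 1)
print(predict_proba(x_adv) > 0.5)  # False (adversarial: misclassified)
```

The same one-step idea, applied iteratively with projection back into the epsilon-ball, gives the PGD attack mentioned in the forward citations below.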

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Fortifying Time Series: DTW-Certified Robust Anomaly Detection

    cs.LG 2026-05 unverdicted novelty 8.0

    First DTW-certified robust anomaly detection for time series via randomized smoothing adapted through an l_p-to-DTW lower-bound transformation.

  2. Can Drift-Adaptive Malware Detectors Be Made Robust? Attacks and Defenses Under White-Box and Black-Box Threats

    cs.CR 2026-04 unverdicted novelty 7.0

    A fine-tuning framework reduces PGD attack success on AdvDA detectors from 100% to 3.2% and MalGuise from 13% to 5.1%, but optimal training strategies differ by threat model and robustness does not transfer across them.

  3. Jailbroken: How Does LLM Safety Training Fail?

    cs.LG 2023-07 unverdicted novelty 6.0

    LLM safety training fails due to competing objectives and mismatched generalization, enabling new jailbreaks that succeed on all unsafe prompts from red-teaming sets in GPT-4 and Claude.