Audio Adversarial Examples: Targeted Attacks on Speech-to-Text

Nicholas Carlini, David Wagner

arxiv: 1801.01944 · v2 · pith:3QO2RYBJnew · submitted 2018-01-05 · 💻 cs.LG · cs.AI · cs.CR

classification 💻 cs.LG · cs.AI · cs.CR

keywords audio · adversarial · examples · attack · targeted · another · apply · attacks

We construct targeted audio adversarial examples on automatic speech recognition. Given any audio waveform, we can produce another that is over 99.9% similar, but transcribes as any phrase we choose (recognizing up to 50 characters per second of audio). We apply our white-box iterative optimization-based attack to Mozilla's implementation of DeepSpeech end-to-end, and show it has a 100% success rate. The feasibility of this attack introduces a new domain for studying adversarial examples.
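
The abstract names a white-box iterative optimization-based attack but gives no details, so the following is only a minimal sketch of that general idea and not the authors' method. It assumes a toy linear softmax classifier in place of DeepSpeech, a short list of floats in place of a waveform, a class index in place of a target phrase, and a hard cap `eps` on the perturbation size. All names (`W`, `x`, `targeted_attack`, `eps`, `lr`, `steps`) are hypothetical.

```python
import math

def logits(W, x):
    """Linear scores: one dot product per class (toy stand-in for a real model)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict(W, x):
    z = logits(W, x)
    return z.index(max(z))

def targeted_attack(W, x, target, eps=0.05, lr=0.01, steps=200):
    """Gradient descent on the perturbation delta so that x + delta is
    classified as `target`, with every |delta_i| kept at or below eps."""
    delta = [0.0] * len(x)
    for _ in range(steps):
        x_adv = [a + d for a, d in zip(x, delta)]
        p = softmax(logits(W, x_adv))
        # White-box step: gradient of the cross-entropy loss -log p[target]
        # with respect to the input is W^T (p - onehot(target)).
        err = [p[k] - (1.0 if k == target else 0.0) for k in range(len(W))]
        grad = [sum(W[k][i] * err[k] for k in range(len(W)))
                for i in range(len(x))]
        # Descend, then clip so the perturbation stays small.
        delta = [max(-eps, min(eps, d - lr * g)) for d, g in zip(delta, grad)]
    return delta

# Hypothetical toy data: three classes, four "samples".
W = [[1.0, 0.0, 1.0, 0.0],
     [0.0, 1.0, 0.0, 1.0],
     [0.5, 0.5, 0.5, 0.5]]
x = [0.5, 0.45, 0.5, 0.45]

delta = targeted_attack(W, x, target=1)
x_adv = [a + d for a, d in zip(x, delta)]
print(predict(W, x), predict(W, x_adv))
```

In this toy setup the clean input falls in class 0 and the perturbed input in the chosen class 1, while no sample moves by more than `eps`. A real speech-to-text attack would need a sequence-level loss and a differentiable model of the full recognition pipeline, which this sketch does not attempt.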

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Codec-Robust Attacks on Audio LLMs
cs.SD 2026-05 unverdicted novelty 7.0

CodecAttack perturbs audio in codec latent space with multi-bitrate expectation over transformation (EoT) to achieve an 85.5% average attack success rate (ASR) on Opus-compressed Audio LLMs versus under 26% for waveform baselines, with transfer to MP3 and AAC.

Codec-Robust Attacks on Audio LLMs
cs.SD 2026-05 unverdicted novelty 6.0

CodecAttack optimizes perturbations in neural audio codec latent space to reach an 85.5% average target-substring attack success rate (ASR) on compressed Opus audio while waveform baselines stay below 26%.

Mobile GUI Agents under Real-world Threats: Are We There Yet?
cs.CR 2025-07 conditional novelty 6.0

Introduces an app-content instrumentation framework and benchmark showing that examined GUI agents suffer 42.0% and 36.1% average misleading rates from third-party content in dynamic and static tests respectively.