pith. machine review for the scientific record.

Adversarial robustness as a prior for learned representations

3 Pith papers cite this work. Polarity classification is still indexing.


years: 2026 (2) · 2022 (1)

representative citing papers

Toy Models of Superposition

cs.LG · 2022-09-21 · accept · novelty 8.0

Toy models demonstrate that polysemanticity arises when neural networks represent more sparse features than they have neurons via superposition, producing a phase transition tied to polytope geometry and increased adversarial vulnerability.

Laundering AI Authority with Adversarial Examples

cs.CR · 2026-05-05 · unverdicted · novelty 5.0

Adversarial examples enable AI authority laundering by causing production VLMs to give authoritative but wrong responses on subtly perturbed images, with success rates of 22-100% using decade-old attack methods.

citing papers explorer

Showing 3 of 3 citing papers.

  • Toy Models of Superposition cs.LG · 2022-09-21 · accept · none · ref 14


  • Adjoint Inversion Reveals Holographic Superposition and Destructive Interference in CNN Classifiers cs.CV · 2026-04-30 · unverdicted · none · ref 5

    CNN classifiers work by holographic superposition and destructive interference in pixel space, rather than by selecting cleaned features, as argued via a new adjoint inversion framework that also yields a covariance-volume channel-selection algorithm.

  • Laundering AI Authority with Adversarial Examples cs.CR · 2026-05-05 · unverdicted · none · ref 21
