pith. machine review for the scientific record.

arxiv: 2604.25965 · v1 · submitted 2026-04-28 · 📊 stat.ML · cs.LG

Recognition: unknown

Adversarial Robustness of NTK Neural Networks

Authors on Pith: no claims yet

Pith reviewed 2026-05-07 15:04 UTC · model grok-4.3

classification 📊 stat.ML cs.LG
keywords adversarial robustness · NTK · neural tangent kernel · nonparametric regression · Sobolev spaces · minimax rates · gradient flow · early stopping

The pith

NTK neural networks achieve the minimax optimal rate for adversarial regression in Sobolev spaces when trained with gradient flow and early stopping.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper studies the adversarial robustness of neural tangent kernel networks in nonparametric regression settings. It first determines the lowest possible error rates any estimator can guarantee when recovering Sobolev-smooth functions under adversarial perturbations. It then shows that NTK networks reach exactly those rates if trained by gradient flow stopped before overfitting occurs. The same networks lose robustness when they interpolate the training data exactly. These findings clarify when kernel-like deep models can deliver guaranteed protection against attacks in regression tasks.

Core claim

The paper establishes minimax optimal rates for adversarial nonparametric regression in Sobolev spaces and proves that NTK neural networks trained via gradient flow with early stopping attain these rates. In contrast, the minimum-norm interpolant in the overfitting regime is vulnerable to adversarial perturbations.
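The objects in this claim can be written out. The sketch below uses the norms described in the simulated rebuttal (a Euclidean perturbation ball of radius ε for the adversary, a Sobolev ball H^s(R) for the target) and the standard nonparametric exponent; the ε-dependent term is a reconstruction consistent with those statements, not the paper's verbatim theorem:

```latex
% Adversarial risk of an estimator \hat f against the target f (sketch):
R_A(\hat f, f)
  \;=\; \mathbb{E}_{X,\,\mathcal{D}_n}\!\Big[\sup_{\|x' - X\|_2 \le \epsilon}
        \big|\hat f(x') - f(X)\big|^2\Big].
% Claimed achievable rate over the Sobolev ball H^s(R), up to constants,
% for gradient flow stopped at t^\ast \asymp n^{\frac{2s}{2s+d}}:
\sup_{f \in H^s(R)} R_A\big(\hat f_{t^\ast}, f\big)
  \;\lesssim\; n^{-\frac{2s}{2s+d}} \;+\; \epsilon^{\,2(1 \wedge s)}.
```

Taking ε = 0 recovers the standard non-adversarial minimax rate, which is why the referee's question about whether the adversarial problem is genuinely harder turns on the second term.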

What carries the argument

The neural tangent kernel governing infinite-width gradient flow dynamics, together with early stopping that prevents the network from reaching the interpolating solution.
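This mechanism can be sketched numerically. The following is a minimal stand-in, not the paper's construction: a Laplacian kernel (whose RKHS is norm-equivalent to a Sobolev class, as the 1D NTK's is) replaces the NTK, and the closed-form spectral filter for kernel gradient flow shows how a finite stopping time keeps the fit away from the interpolant:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 60
X = np.sort(rng.uniform(0.0, 1.0, n))
y = np.sin(2 * np.pi * X) + 0.3 * rng.normal(size=n)

# Stand-in kernel: exp(-|x - y|), whose RKHS is a Sobolev class, playing
# the role the NTK plays in the paper (an illustrative assumption).
K = np.exp(-np.abs(X[:, None] - X[None, :]))
evals, evecs = np.linalg.eigh(K)

def flow_coeffs(t):
    # Kernel gradient flow stopped at time t acts as the spectral filter
    # (1 - exp(-t * lambda / n)) / lambda on each eigendirection of K.
    filt = (1.0 - np.exp(-t * evals / n)) / evals
    return evecs @ (filt * (evecs.T @ y))

mses = {}
for t in (1.0, 50.0, 1e6):
    alpha = flow_coeffs(t)
    mses[t] = float(np.mean((K @ alpha - y) ** 2))
    print(f"t = {t:>9}: train MSE {mses[t]:.5f}")
# The training residual equals exp(-t K / n) y, so train MSE decreases
# monotonically in t; as t -> infinity the flow reaches the minimum-norm
# interpolant, which the paper identifies as the fragile regime.
```

Early stopping truncates the small-eigenvalue directions of the kernel, which is exactly the spectral regularization the paper credits for robustness.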

Load-bearing premise

The training dynamics of the neural network are exactly described by the kernel gradient flow in the infinite-width limit under the standard Sobolev nonparametric regression model.

What would settle it

A numerical experiment with large but finite width networks showing that the adversarial risk after early stopping exceeds the derived minimax rate on Sobolev test functions would disprove the achievement claim.
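Such an experiment needs an empirical proxy for the adversarial risk. A minimal sketch, with a hypothetical helper and a 1D grid search standing in for the inner supremum over the perturbation ball:

```python
import numpy as np

def adversarial_risk(f_hat, f_star, xs, eps, grid=41):
    # Empirical adversarial risk: for each test point x, the adversary picks
    # the worst x' in [x - eps, x + eps]; the grid search approximates the sup.
    worst = []
    for x in xs:
        deltas = np.linspace(-eps, eps, grid)
        worst.append(np.max((f_hat(x + deltas) - f_star(x)) ** 2))
    return float(np.mean(worst))

# Toy check with known functions (illustrative, not the paper's estimator):
f_star = np.sin
f_hat = lambda x: np.sin(x) + 0.1          # deliberately biased estimate
xs = np.linspace(0.0, 2 * np.pi, 100)
print(adversarial_risk(f_hat, f_star, xs, eps=0.0))  # plain L2 risk, exactly 0.1^2
print(adversarial_risk(f_hat, f_star, xs, eps=0.3))  # strictly larger under attack
```

Running this on a finite-width network after early stopping, and comparing against the n^{-2s/(2s+d)}-type rate, is the shape of the falsification test described above.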

Figures

Figures reproduced from arXiv: 2604.25965 by Yuxuan Hou.

Figure 1. Evolution of adversarial risk R_A over training time t in the exact NTK regime. From left to right: 1D synthetic data with Gaussian noise, the real-world Diabetes regression dataset on S^{d−1}, and high-dimensional (d = 5) synthetic data. The universally consistent U-shaped curves highlight the fundamental necessity of early stopping (or equivalent spectral regularization) to prevent the severe degradation of adversaria…
Figure 2. Left: training dynamics of a wide ReLU network. Right: function-space visualization.
Figure 3. Evolution of adversarial risk R_A over training time t with α-trimming smoothing.
Original abstract

Deep learning models are widely deployed in safety-critical domains, but remain vulnerable to adversarial attacks. In this paper, we study the adversarial robustness of NTK neural networks in the context of nonparametric regression. We establish minimax optimal rates for adversarial regression in Sobolev spaces and then show that NTK neural networks, trained via gradient flow with early stopping, can achieve this optimal rate. However, in the overfitting regime, we prove that the minimum norm interpolant is vulnerable to adversarial perturbations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper studies adversarial robustness of NTK neural networks for nonparametric regression. It first derives minimax optimal rates for adversarial regression over Sobolev balls, then shows that infinite-width NTK networks trained by gradient flow with early stopping attain these rates. It further proves that the minimum-norm interpolant is vulnerable to adversarial perturbations in the overfitting regime.

Significance. If the central claims hold, the work provides a clean theoretical link between adversarial robustness, kernel gradient flow, and early stopping in the NTK regime. It supplies explicit minimax rates and identifies a concrete training procedure that achieves them, which is a positive contribution to the literature connecting nonparametric statistics with overparameterized models.

major comments (2)
  1. [Minimax rate section] The minimax lower bound derivation for adversarial regression (presumably in the section establishing the rate) must be checked against the precise definition of the adversarial loss; if the perturbation ball is taken in the same Sobolev norm as the function class, the rate may reduce to the standard nonparametric rate rather than a genuinely harder adversarial one.
  2. [NTK training dynamics section] The argument that early-stopped NTK gradient flow attains the minimax rate relies on the equivalence to kernel ridge regression with a specific stopping time; the manuscript should explicitly verify that the chosen stopping time is independent of the unknown smoothness parameter and does not require oracle knowledge of the Sobolev radius.
minor comments (2)
  1. Notation for the adversarial perturbation radius and the Sobolev ball radius should be distinguished more clearly to avoid confusion between the two parameters.
  2. The statement that the min-norm interpolant is vulnerable should include a quantitative lower bound on the adversarial risk rather than a qualitative claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive evaluation and constructive feedback on our manuscript. We address each major comment below with clarifications and proposed revisions where appropriate.

Point-by-point responses
  1. Referee: [Minimax rate section] The minimax lower bound derivation for adversarial regression (presumably in the section establishing the rate) must be checked against the precise definition of the adversarial loss; if the perturbation ball is taken in the same Sobolev norm as the function class, the rate may reduce to the standard nonparametric rate rather than a genuinely harder adversarial one.

    Authors: We thank the referee for highlighting this point. In the paper, the adversarial loss is defined using perturbations in the Euclidean norm on the input space (||δ||_2 ≤ ε for a fixed ε > 0), while the function class is the Sobolev ball of radius R in the appropriate Sobolev norm on the function space. The lower bound construction uses a packing argument over the Sobolev ball, where the adversary's sup over perturbations increases the effective separation needed between hypotheses, yielding a strictly slower rate than the non-adversarial minimax rate (specifically, the exponent worsens by a term depending on ε and the dimension). We will add an explicit remark in the minimax section clarifying the distinct norms and confirming that the adversarial problem is genuinely harder. revision: yes

  2. Referee: [NTK training dynamics section] The argument that early-stopped NTK gradient flow attains the minimax rate relies on the equivalence to kernel ridge regression with a specific stopping time; the manuscript should explicitly verify that the chosen stopping time is independent of the unknown smoothness parameter and does not require oracle knowledge of the Sobolev radius.

    Authors: We appreciate the referee's suggestion for greater clarity on adaptivity. The stopping time in our gradient flow analysis is selected to match the bias-variance tradeoff for the unknown smoothness s (of the form t ≈ n^{2s/(2s+d)} up to constants depending on R), which is standard for achieving the exact minimax rate. While the theoretical statement assumes knowledge of s for the precise rate, we will revise the manuscript to note that the stopping time can be chosen in a data-driven manner (e.g., via cross-validation or Lepski's method) without oracle knowledge of s or R, at the cost of possible logarithmic factors in the rate. This addresses the practical concern while preserving the main equivalence to early-stopped KRR. revision: partial
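The data-driven selection the authors point to can be sketched with a plain holdout split (illustrative; Lepski's method would replace the grid-plus-validation step in a fully adaptive analysis). The Laplacian stand-in kernel and all names here are assumptions, not the paper's construction:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 120
X = rng.uniform(0.0, 1.0, n)
y = np.sin(2 * np.pi * X) + 0.3 * rng.normal(size=n)

# Fit the kernel gradient flow on `tr`; pick the stopping time on `va`,
# with no knowledge of the smoothness s or the Sobolev radius R.
tr, va = np.arange(90), np.arange(90, n)
K = np.exp(-np.abs(X[:, None] - X[None, :]))  # Sobolev-equivalent stand-in kernel
evals, evecs = np.linalg.eigh(K[np.ix_(tr, tr)])

def predict(t, idx):
    # Spectral-filter form of kernel gradient flow stopped at time t.
    filt = (1.0 - np.exp(-t * evals / len(tr))) / evals
    alpha = evecs @ (filt * (evecs.T @ y[tr]))
    return K[np.ix_(idx, tr)] @ alpha

ts = np.logspace(0, 6, 25)
val_err = [float(np.mean((predict(t, va) - y[va]) ** 2)) for t in ts]
t_hat = float(ts[int(np.argmin(val_err))])
print(f"data-driven stopping time: {t_hat:.1f}")
```

The selected time sits strictly between the underfit (small t) and interpolating (large t) extremes, which is the bias-variance tradeoff the rebuttal describes, recovered without oracle knowledge of s or R.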

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper's central argument consists of two independent components: first, deriving minimax optimal rates for adversarial nonparametric regression over Sobolev balls using standard statistical theory, and second, showing that infinite-width NTK gradient flow with early stopping attains those rates via kernel analysis. Neither step reduces to the other by construction, fitted parameters, or self-citation chains; the early-stopping regime is distinguished from the vulnerable min-norm interpolant using established regularization properties. The derivation remains self-contained against external benchmarks without load-bearing self-referential reductions.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The claims rest on standard nonparametric statistics and kernel theory; no new free parameters or invented entities are introduced in the abstract.

axioms (2)
  • standard math Sobolev space smoothness class for the regression functions
    Defines the function class in which minimax rates are derived.
  • domain assumption NTK gradient flow exactly describes infinite-width network training
    Standard assumption in the NTK literature invoked for the training analysis.

pith-pipeline@v0.9.0 · 5359 in / 1185 out tokens · 47308 ms · 2026-05-07T15:04:08.926928+00:00 · methodology


    and the NTK kernel in this paper are different up to adding 1, and noticing that 1 lies in the Sobolev space, for the NTK kernel in our setting, we also have that the RKHS of NTK in a bounded domain with smooth boundary is a Sobolev class. Thus, the NTK kernel and the exponential kernelk(x, y) =e−|x−y| are equivalent in a bounded smooth boundary domain, a...