Spectrally Safe Neural Operator Warm-Starts for Large-Scale Newton Solvers
Neural operators are increasingly used to warm-start Newton solvers for nonlinear PDEs, on the premise that a low test error places the initial guess inside the basin of attraction. We show that this premise is unreliable. An operator trained to a relative \(L^2\) error of \(O(10^{-3})\) can still produce an initial state at which the discrete Jacobian is indefinite, because mean-squared training controls the error only on average and leaves localized pointwise violations of the underlying physics unconstrained. For a nearly incompressible hyperelasticity problem, we trace this to the predicted volume change: the operator disperses \(\det F\) well away from one, and the resulting Jacobian acquires negative eigenvalues even when the predicted field is visually indistinguishable from the reference. At small scale this is a nuisance; at multi-million degree-of-freedom scale it is disqualifying, since conjugate gradient and the other Krylov solvers needed for memory-feasible Newton steps assume a definite spectrum. We then show that a short, label-free fine-tuning phase (penalizing the operator against the discrete energy, with no additional solution data) restores positive definiteness of the Jacobian. Combined with an inexact outer loop, this gives a warm-started Newton method that converges across the full loading range where the unregularized operator fails, reaching up to a 5.4\(\times\) wall-clock speedup over incremental continuation on a 3D problem with 6.4 million degrees of freedom.
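The gap between average error and pointwise physics can be illustrated with a one-dimensional toy analogue. This is a minimal sketch and not the paper's hyperelasticity setup: the bar geometry, the reference displacement, and the perturbation amplitude `eps` and wavenumber `k` are all assumptions chosen for illustration. In 1D the deformation gradient reduces to \(F = 1 + u'(x)\), so \(\det F = F\). A small, highly oscillatory perturbation keeps the relative \(L^2\) error of the displacement near \(10^{-3}\) while pushing \(F\) below zero locally.

```python
import numpy as np

def relative_l2_error(pred, ref):
    """Discrete relative L2 error between two sampled fields."""
    return np.linalg.norm(pred - ref) / np.linalg.norm(ref)

# Hypothetical 1D bar on [0, 1], sampled finely enough to resolve the
# perturbation (about 50 grid points per wavelength).
n = 200_001
x = np.linspace(0.0, 1.0, n)

# Assumed reference displacement: uniform 10% stretch, so F = 1.1 everywhere.
u_ref = 0.1 * x

# Assumed "prediction": reference plus a tiny high-frequency error.
eps = 8.0e-5                 # illustrative amplitude
k = 2.0 * np.pi * 4000.0     # illustrative wavenumber
u_pred = u_ref + eps * np.sin(k * x)

rel_err = relative_l2_error(u_pred, u_ref)

# In 1D, det F = F = 1 + du/dx, approximated here by finite differences.
F_ref = 1.0 + np.gradient(u_ref, x)
F_pred = 1.0 + np.gradient(u_pred, x)

print(f"relative L2 error of displacement: {rel_err:.2e}")
print(f"reference F range: [{F_ref.min():.3f}, {F_ref.max():.3f}]")
print(f"predicted F range: [{F_pred.min():.3f}, {F_pred.max():.3f}]")
```

The displacement error is controlled by the amplitude `eps`, whereas the error in \(F\) scales with `eps * k`, which is why an error metric on the field alone does not bound the volume change.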