Finite-width shallow networks remain within poly(d) m^{-min(1,c/6)} of their mean-field limit uniformly in time when mean-field excess loss decays as t^{-c} under standard regularity and an integral condition on the loss.
Quantitative convergence of Wasserstein gradient flows of kernel mean discrepancies. arXiv preprint arXiv:2603.01977
6 Pith papers cite this work. Polarity classification is still indexing.
citation summary
years: 2026 (6)
verdicts: UNVERDICTED (6)
roles: background (1)
polarities: background (1)
representative citing papers
Under Ahlfors regularity of exponent β, the minimal energy distance between a measure and its N-point empirical version decays exactly as N to the power -½(1 + q/β) for power kernels with exponent q in (0,2).
Introduces WSFN, a Newton-type method on Wasserstein space that escapes saddle points in polynomial time and achieves linear convergence to global minimizers under benign landscape assumptions.
Sobolev regularization on the witness function enables global convergence of MMD gradient flows for both sampling and generative modeling without isoperimetric assumptions.
Mean-field SVGD flow converges locally to the target on the torus at explicit polynomial L2 rates for Riesz kernels. The rates depend on dimension and regularity and are sharp in some regimes, and global exponential convergence is recovered for Coulomb kernels.
Lifts CCCP to Wasserstein space for DC functionals on measures, proves almost stationarity under smoothness/strong-convexity assumptions, and applies to MMD/ED with local convergence and faster empirical runs.
citing papers explorer
- Sharp Rates of MMD Empirical Estimation with Power Kernels
Under Ahlfors regularity of exponent β, the minimal energy distance between a measure and its N-point empirical version decays exactly as N to the power -½(1 + q/β) for power kernels with exponent q in (0,2).