pith. machine review for the scientific record.

Adaptive gradient methods at the edge of stability

5 Pith papers cite this work. Polarity classification is still indexing.



citing papers explorer

Showing 5 of 5 citing papers.

  • Phases of Muon: When Muon Eclipses SignSGD math.OC · 2026-05-10 · unverdicted · polarity none · ref 16

    On power-law covariance least squares problems, SignSVD (Muon) and SignSGD (Adam proxy) show three phases of relative performance depending on data exponent α and target exponent β.

  • Spectral Dynamics in Deep Networks: Feature Learning, Outlier Escape, and Learning Rate Transfer cond-mat.dis-nn · 2026-05-08 · unverdicted · polarity none · ref 26

    A two-level DMFT predicts width-consistent outlier escape and hyperparameter transfer under μP in deep networks, with bulk restructuring dominating for tasks with many outputs.

  • A Rod Flow Model for Adam at the Edge of Stability cs.LG · 2026-05-07 · unverdicted · polarity none · ref 10

    Rod flow models for Adam and related optimizers track discrete iterates at the edge of stability more accurately than standard stable flows across tested ML architectures.

  • Zeroth-Order Optimization at the Edge of Stability cs.LG · 2026-04-16 · unverdicted · polarity none · ref 2

    Zeroth-order methods achieve mean-square stability when the step size satisfies a condition involving the entire Hessian spectrum, with full-batch ZO optimizers operating at the edge of stability and large steps regularizing the Hessian trace.

  • Momentum Further Constrains Sharpness at the Edge of Stochastic Stability cs.LG · 2026-04-15 · unverdicted · polarity none · ref 7

    Momentum SGD exhibits two distinct edge-of-stochastic-stability (EoSS) regimes for batch sharpness, stabilizing at 2(1-β)/η for small batches and 2(1+β)/η for large batches, aligning with linear stability thresholds.
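
The thresholds quoted in the last entry can be sanity-checked numerically. A minimal sketch (the function name and parameter values are illustrative, and the small/large-batch split is taken directly from the summary above, not re-derived):

```python
def eoss_sharpness_threshold(eta, beta, large_batch=False):
    """Sharpness plateau for momentum SGD at the edge of stochastic
    stability, per the cited summary: 2(1 - beta)/eta for small
    batches and 2(1 + beta)/eta for large batches."""
    sign = 1.0 if large_batch else -1.0
    return 2.0 * (1.0 + sign * beta) / eta

eta, beta = 0.5, 0.5  # illustrative values, chosen to be exact in binary
print(eoss_sharpness_threshold(eta, beta))                    # 2.0 (small batch)
print(eoss_sharpness_threshold(eta, beta, large_batch=True))  # 6.0 (large batch)
# With beta = 0 both regimes collapse to the classic GD threshold 2/eta:
print(eoss_sharpness_threshold(eta, 0.0))                     # 4.0
```

Note that momentum widens the gap between the two regimes symmetrically around 2/η, which is consistent with the β = 0 case reducing to the standard full-batch edge-of-stability value.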