mega hub Mixed citations

Adam: A Method for Stochastic Optimization

Mixed citation behavior. Most common role is method (50%).

2073 Pith papers citing it
Method: 50% of classified citations

abstract

We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.
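The update rule the abstract summarizes can be sketched in a few lines. This is a minimal pure-Python illustration under stated assumptions, not the authors' reference code: the names `adam_step`, `adamax_step`, `minimize`, and `toy_grad`, the toy quadratic objective, and the zero-denominator guard in `adamax_step` are all introduced here for illustration. The default hyper-parameter values are the ones commonly quoted for Adam and AdaMax.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a list of floats; t is the 1-based step count."""
    new_theta, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(theta, grad, m, v):
        m_i = beta1 * m_i + (1.0 - beta1) * g        # running mean of gradients
        v_i = beta2 * v_i + (1.0 - beta2) * g * g    # running mean of squared gradients
        m_hat = m_i / (1.0 - beta1 ** t)             # correct the bias toward zero
        v_hat = v_i / (1.0 - beta2 ** t)
        new_theta.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v


def adamax_step(theta, grad, m, u, t, lr=2e-3, beta1=0.9, beta2=0.999):
    """Apply one AdaMax update: the second moment is replaced by an infinity-norm bound."""
    new_theta, new_m, new_u = [], [], []
    for p, g, m_i, u_i in zip(theta, grad, m, u):
        m_i = beta1 * m_i + (1.0 - beta1) * g
        u_i = max(beta2 * u_i, abs(g))               # exponentially weighted max of |g|
        # Assumption of this sketch: skip the step while u_i is still exactly zero.
        step = 0.0 if u_i == 0.0 else (lr / (1.0 - beta1 ** t)) * m_i / u_i
        new_theta.append(p - step)
        new_m.append(m_i)
        new_u.append(u_i)
    return new_theta, new_m, new_u


def minimize(grad_fn, theta0, steps, step_fn=adam_step, **kwargs):
    """Run step_fn for a fixed number of steps, starting from zero moment estimates."""
    theta = list(theta0)
    m = [0.0] * len(theta)
    s = [0.0] * len(theta)
    for t in range(1, steps + 1):
        theta, m, s = step_fn(theta, grad_fn(theta), m, s, t, **kwargs)
    return theta


def toy_grad(theta):
    """Gradient of the hypothetical objective (x - 3)^2 + 10 * (y + 1)^2."""
    x, y = theta
    return [2.0 * (x - 3.0), 20.0 * (y + 1.0)]


if __name__ == "__main__":
    print(minimize(toy_grad, [0.0, 0.0], steps=2000, lr=0.1))
    print(minimize(toy_grad, [0.0, 0.0], steps=2000, step_fn=adamax_step, lr=0.1))
```

Two properties from the abstract are visible in the sketch. On the first step the bias-corrected ratio reduces to the sign of the gradient, so each parameter moves by about `lr` regardless of gradient magnitude. And because the gradient is divided by the square root of its own second-moment estimate, multiplying every gradient by a positive constant leaves the trajectory unchanged apart from the small effect of `eps`.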

citation-role summary

  • method: 117
  • background: 97
  • other: 9
  • baseline: 8
  • dataset: 2

counterfactual ablation

If this work disappeared, these are the nearest dependency candidates in Pith, weighted toward method, dataset, baseline, and extension contexts where available. This is a structural signal, not a retraction verdict.

representative citing papers

ShardNet: Training Neural Controllers with Hard, Non-Convex Constraints

eess.SY · 2026-06-29 · unverdicted · novelty 8.0

ShardNet enforces non-convex polyhedral safety constraints in neural controllers by construction via a differentiable projection layer, achieving 100% verified safety and over 3x larger safe sets than prior methods on double integrator benchmarks.

Adam Converges in Nonsmooth Nonconvex Optimization

math.OC · 2026-06-21 · unverdicted · novelty 8.0

The paper establishes the first finite-time convergence rate of 1/T^{2/13} for classical Adam (with bias correction, no extra steps) in nonsmooth nonconvex optimization under heavy-tailed noise with β1=β2.

Adaptive directional gradients for parameterised quantum circuits

quant-ph · 2026-06-08 · unverdicted · novelty 8.0

A forward-gradient framework for PQCs unifies SPSA and the parameter-shift rule as limiting cases, introduces the QUIVER adaptive optimizer with closed-form measurement allocation, and demonstrates efficient training of 60-qubit circuits on ECG5000 and MNIST.

Neutron Star Equation of State via Physics Informed Neural Network

astro-ph.HE · 2026-05-29 · unverdicted · novelty 8.0

PINNs are used to non-parametrically infer the neutron star EOS from NICER and pulsar data, producing M_max = 2.06 M_sun, R_1.4 = 12.85 km, and a reproducible speed-of-sound softening at 2-4 rho_0 consistent with quark-hadron crossover.

ENSEMBITS: an alphabet of protein conformational ensembles

cs.LG · 2026-05-13 · unverdicted · novelty 8.0 · 2 refs

Ensembits is the first tokenizer of protein conformational ensembles that outperforms static tokenizers on RMSF prediction and matches them on function and mutation tasks while using less pretraining data.

SLayerGen: a Crystal Generative Model for all Space and Layer Groups

cond-mat.mtrl-sci · 2026-05-07 · unverdicted · novelty 8.0

SLayerGen generates crystals invariant to any space or layer group via autoregressive lattice and Wyckoff sampling plus equivariant diffusion, achieving gains over bulk models on diperiodic materials after correcting a prior loss inconsistency for hexagonal groups.

3DSS: 3D Surface Splatting for Inverse Rendering

cs.GR · 2026-05-07 · unverdicted · novelty 8.0 · 3 refs

3DSS is the first differentiable surface splatting renderer that recovers shape, spatially-varying BRDF materials, and HDR illumination from multi-view images via a coverage-based compositing model derived from reconstruction kernels.

MMGait: Towards Multi-Modal Gait Recognition

cs.CV · 2026-04-17 · conditional · novelty 8.0

MMGait provides a new multi-sensor gait dataset and OmniGait baseline to support single-modal, cross-modal, and unified multi-modal person identification from walking patterns.

CMCC-ReID: Cross-Modality Clothing-Change Person Re-Identification

cs.CV · 2026-04-03 · unverdicted · novelty 8.0

The paper introduces the CMCC-ReID task, constructs the SYSU-CMCC benchmark dataset, and proposes the PIA network with disentangling and prototype modules that outperforms prior methods on combined modality and clothing variations.
