pith. sign in

mega hub Mixed citations

Adam: A Method for Stochastic Optimization

Mixed citation behavior. Most common role is method (50%).

1599 Pith papers citing it
Method 50% of classified citations
abstract

We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.

hub tools

citation-role summary

method 117 background 96 other 9 baseline 8 dataset 2

citation-polarity summary

claims ledger

  • abstract We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little

authors

mega hub controls

Recognition alignment

counterfactual ablation

If this work disappeared, these are the nearest dependency candidates in Pith, weighted toward method, dataset, baseline, and extension contexts where available. This is a structural signal, not a retraction verdict.

co-cited works

representative citing papers

ENSEMBITS: an alphabet of protein conformational ensembles

cs.LG · 2026-05-13 · unverdicted · novelty 8.0 · 2 refs

Ensembits is the first tokenizer of protein conformational ensembles that outperforms static tokenizers on RMSF prediction and matches them on function and mutation tasks while using less pretraining data.

SLayerGen: a Crystal Generative Model for all Space and Layer Groups

cond-mat.mtrl-sci · 2026-05-07 · unverdicted · novelty 8.0

SLayerGen generates crystals invariant to any space or layer group via autoregressive lattice and Wyckoff sampling plus equivariant diffusion, achieving gains over bulk models on diperiodic materials after correcting a prior loss inconsistency for hexagonal groups.

3DSS: 3D Surface Splatting for Inverse Rendering

cs.GR · 2026-05-07 · unverdicted · novelty 8.0 · 3 refs

3DSS is the first differentiable surface splatting renderer that recovers shape, spatially-varying BRDF materials, and HDR illumination from multi-view images via a coverage-based compositing model derived from reconstruction kernels.

MMGait: Towards Multi-Modal Gait Recognition

cs.CV · 2026-04-17 · conditional · novelty 8.0

MMGait provides a new multi-sensor gait dataset and OmniGait baseline to support single-modal, cross-modal, and unified multi-modal person identification from walking patterns.

CMCC-ReID: Cross-Modality Clothing-Change Person Re-Identification

cs.CV · 2026-04-03 · unverdicted · novelty 8.0

The paper introduces the CMCC-ReID task, constructs the SYSU-CMCC benchmark dataset, and proposes the PIA network with disentangling and prototype modules that outperforms prior methods on combined modality and clothing variations.

Discovering Latent Knowledge in Language Models Without Supervision

cs.CL · 2022-12-07 · conditional · novelty 8.0

An unsupervised technique extracts latent yes-no knowledge from language model activations by locating a direction that satisfies logical consistency properties, outperforming zero-shot accuracy by 4% on average across models and datasets.

Locating and Editing Factual Associations in GPT

cs.CL · 2022-02-10 · accept · novelty 8.0

Factual associations in autoregressive transformers are localized to mid-layer feed-forward modules and can be edited via rank-one model editing while preserving both specificity and generalization on counterfactual tests.

citing papers explorer

Showing 50 of 1599 citing papers.