citation dossier
Adam: A method for stochastic optimization
2 Pith papers citing it
2 reference links
cs.CL · top field · 1 paper
ACCEPT · top verdict bucket · 1 paper
why this work matters in Pith
Pith has found this work in 2 reviewed papers. Its strongest current cluster is cs.CL (1 paper). The largest review-status bucket among citing papers is ACCEPT (1 paper). For highly cited works, this page shows a dossier first and a bounded explorer second; it never tries to render every citing paper at once.
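For reference, the work this dossier covers is the Adam optimizer. A minimal sketch of its update rule follows; the function name and list-based parameter handling are illustrative, not from any reference implementation, and the defaults mirror the paper's suggested hyperparameters:

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update over flat parameter lists.

    m, v: exponential moving averages of the gradient and its square;
    t: 1-based step count, used for bias correction of the averages.
    """
    m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, grad)]
    v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, grad)]
    # Bias-corrected moment estimates.
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    # Per-coordinate step scaled by the inverse root of the second moment.
    theta = [ti - lr * mh / (math.sqrt(vh) + eps)
             for ti, mh, vh in zip(theta, m_hat, v_hat)]
    return theta, m, v
```

For example, minimizing f(x) = x² from x = 1.0 with the gradient 2x moves the parameter steadily toward zero, with each early step close in magnitude to the learning rate.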
years: 2017
2 representative citing papers
Pith review generated a malformed one-line summary.
citing papers explorer
- Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
  A noisy top-k gated mixture-of-experts layer between LSTMs scales neural networks to 137B parameters with sub-linear compute, beating SOTA on language modeling and machine translation.
- Attention Is All You Need
  Pith review generated a malformed one-line summary.
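The first explorer entry summarizes noisy top-k gating. A hedged sketch of how such a gate selects experts, under the assumption that each expert has a scalar logit, noise is Gaussian, and the softmax is taken only over the k survivors; names are illustrative, not from the paper's code:

```python
import math
import random

def noisy_top_k_gate(logits, k, noise_std=1.0, rng=random):
    """Return a gate weight per expert: add Gaussian noise to the logits,
    keep only the top-k noisy scores, and softmax over those survivors.
    Experts outside the top-k receive weight 0, so only k experts run."""
    noisy = [l + rng.gauss(0.0, noise_std) for l in logits]
    top = sorted(range(len(noisy)), key=lambda i: noisy[i], reverse=True)[:k]
    exp = {i: math.exp(noisy[i]) for i in top}
    z = sum(exp.values())
    return [exp.get(i, 0.0) / z for i in range(len(noisy))]
```

The zero weights are what make the compute sub-linear in the number of experts: with k fixed, only k expert networks are evaluated per input regardless of how many experts exist.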