A noisy top-k gated mixture-of-experts layer between LSTMs scales neural networks to 137B parameters with sub-linear compute, beating SOTA on language modeling and machine translation.
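The noisy top-k gating named in this summary can be sketched as follows. This is a minimal numpy illustration, not the paper's implementation: the weight names, dimensions, and single-example shape are all assumptions made for clarity. The gate adds learned input-dependent noise to the expert logits, keeps only the k largest, and renormalizes them with a softmax so the layer computes just k experts per input.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_top_k_gate(x, w_gate, w_noise, k):
    """Noisy top-k gating (illustrative sketch): perturb the gate logits
    with learned noise, keep the k largest, softmax them, zero the rest."""
    clean = x @ w_gate                           # raw gate logits, one per expert
    noise_std = np.log1p(np.exp(x @ w_noise))    # softplus keeps the noise std positive
    noisy = clean + rng.standard_normal(clean.shape) * noise_std
    top_k = np.argsort(noisy)[-k:]               # indices of the k largest noisy logits
    gates = np.zeros_like(noisy)
    shifted = np.exp(noisy[top_k] - noisy[top_k].max())
    gates[top_k] = shifted / shifted.sum()       # softmax over only the kept experts
    return gates                                 # sparse mixture weights summing to 1

# Hypothetical sizes for the sketch: 8-dim input, 4 experts, 2 active per input.
d_model, num_experts, k = 8, 4, 2
x = rng.standard_normal(d_model)
w_gate = rng.standard_normal((d_model, num_experts))
w_noise = rng.standard_normal((d_model, num_experts))
g = noisy_top_k_gate(x, w_gate, w_noise, k)
```

Because only k of the gate values are nonzero, only those experts' forward passes need to run, which is what gives the layer its sub-linear compute relative to total parameter count.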
Adam: A Method for Stochastic Optimization
2 Pith papers cite this work. Polarity classification is still indexing.
2017 · 2 representative citing papers
Pith review generated a malformed one-line summary.
Citing papers explorer
- Attention Is All You Need
Pith review generated a malformed one-line summary.