Adjusting for dropout variance in batch normalization and weight initialization

Dan Hendrycks, Kevin Gimpel · 2016

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

cs.LG · 2016-06-27 · conditional · novelty 6.0

GELU activation xΦ(x) outperforms ReLU and ELU on computer vision, NLP, and speech tasks by weighting inputs by value rather than gating by sign.

Showing 1 of 1 citing paper.

Gaussian Error Linear Units (GELUs) cs.LG · 2016-06-27 · conditional · none · ref 33
GELU activation xΦ(x) outperforms ReLU and ELU on computer vision, NLP, and speech tasks by weighting inputs by value rather than gating by sign.