pith. machine review for the scientific record.

Rae, Anna Potapenko, Siddhant M

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

fields: cs.CL (1) · cs.LG (1)

years: 2021 (1) · 2020 (1)

verdicts: unverdicted (2)

representative citing papers

Rethinking Attention with Performers

cs.LG · 2020-09-30 · unverdicted · novelty 7.0

Performers approximate full-rank softmax attention in Transformers via FAVOR+ random features for linear complexity, with theoretical guarantees of unbiased estimation and competitive results on pixel, text, and protein tasks.
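A minimal sketch of the FAVOR+ idea this summary describes, assuming plain NumPy and a single attention head; the function names (favor_plus_features, performer_attention) and the feature count are illustrative, and the actual Performer implementation additionally uses orthogonal random features and numerical stabilization.

```python
import numpy as np

def favor_plus_features(x, w):
    """Positive random features phi(x) for the softmax kernel.

    x: (L, d) queries or keys, already scaled by d**-0.25
    w: (m, d) Gaussian projections, rows ~ N(0, I_d)
    Returns (L, m) features with E[phi(q) @ phi(k)] = exp(q @ k).
    """
    m = w.shape[0]
    # exp(w^T x - ||x||^2 / 2) / sqrt(m): positive and unbiased for exp(q.k)
    return np.exp(x @ w.T - 0.5 * np.sum(x**2, axis=-1, keepdims=True)) / np.sqrt(m)

def performer_attention(q, k, v, num_features=256, seed=0):
    """Linear-time approximation of softmax attention (single head).

    q, k: (L, d); v: (L, d_v). Cost is O(L * m * d) instead of O(L^2 * d).
    """
    d = q.shape[-1]
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((num_features, d))      # shared projections for q and k
    q_prime = favor_plus_features(q / d**0.25, w)   # (L, m)
    k_prime = favor_plus_features(k / d**0.25, w)   # (L, m)
    kv = k_prime.T @ v                              # (m, d_v), computed once
    normalizer = q_prime @ k_prime.sum(axis=0)      # (L,) row-sum estimate
    return (q_prime @ kv) / normalizer[:, None]
```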

citing papers explorer

Showing 2 of 2 citing papers.

  • Rethinking Attention with Performers cs.LG · 2020-09-30 · unverdicted · none · ref 143

    Performers approximate full-rank softmax attention in Transformers via FAVOR+ random features for linear complexity, with theoretical guarantees of unbiased estimation and competitive results on pixel, text, and protein tasks.

  • Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation cs.CL · 2021-08-27 · unverdicted · none · ref 32

    ALiBi enables transformers trained on length-1024 sequences to extrapolate to length-2048 with the same perplexity as a sinusoidal model trained on 2048, while training 11% faster and using 11% less memory.
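A short sketch of the linear-bias idea behind ALiBi, assuming NumPy and causal (left-to-right) masking; alibi_slopes and alibi_bias are hypothetical helper names, and the slope rule shown is the power-of-two-heads case from the paper.

```python
import numpy as np

def alibi_slopes(num_heads):
    """Geometric head-specific slopes, e.g. 1/2, 1/4, ..., 1/256 for 8 heads.

    Assumes num_heads is a power of two (the paper interpolates otherwise).
    """
    start = 2 ** (-8.0 / num_heads)
    return np.array([start ** (h + 1) for h in range(num_heads)])

def alibi_bias(num_heads, seq_len):
    """(num_heads, seq_len, seq_len) bias added to attention logits before softmax.

    Each head penalizes query-key pairs linearly in their distance; no positional
    embeddings are added to the token representations.
    """
    i = np.arange(seq_len)[:, None]   # query positions
    j = np.arange(seq_len)[None, :]   # key positions
    distance = i - j                  # >= 0 at and below the diagonal
    bias = -alibi_slopes(num_heads)[:, None, None] * distance
    # mask future positions for causal language modeling
    return np.where(j <= i, bias, -np.inf)
```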