pith. machine review for the scientific record.

KV-weights are all you need for skipless transformers. arXiv preprint arXiv:2404.12362

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

fields

cs.LG 1

years

2026 1

verdicts

ACCEPT 1

representative citing papers

Can an MLP Absorb Its Own Skip Connection?

cs.LG · 2026-04-26 · accept · novelty 7.0

Skip-connected MLPs and residual-free MLPs of equal width represent generically disjoint function classes for common activations, with explicit impossibility proofs and a non-generic absorption condition for ReLU and GELU.

citing papers explorer

Showing 1 of 1 citing paper.

  • Can an MLP Absorb Its Own Skip Connection? cs.LG · 2026-04-26 · accept · none · ref 4
