Muse: Parallel multi-scale attention for sequence to sequence learning. arXiv preprint arXiv:1911.09483

· 1911 · arXiv 1911.09483

2 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Key and Value Weights Are Probably All You Need: On the Necessity of the Query, Key, Value weight Triplet in Self-Attention Transformers

cs.LG · 2025-10-27 · unverdicted · novelty 7.0

One of the Q, K or V weights in transformer self-attention is redundant and replaceable by the identity matrix under mild assumptions, reducing parameters by 25 percent with no loss in small-model performance.

Structure-Guided Adaptive Propagation for Protein-Protein Interaction Site Prediction

cs.AI · 2026-06-01 · unverdicted · novelty 5.0

SGAP-PPIS generates residue-wise adaptive propagation coefficients from equivariant GNN geometric states to improve protein-protein interaction site prediction, reporting competitive results on Test_60.

citing papers explorer


Structure-Guided Adaptive Propagation for Protein-Protein Interaction Site Prediction

cs.AI · 2026-06-01 · unverdicted · none · ref 32
