Title resolution pending

Verma, P · 2024 · arXiv 2406.10254

4 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Beyond Sinusoids: A Morlet Wavelet Framework for Transformer Positional Encoding

cs.LG · 2026-05-31 · unverdicted · novelty 7.0

MoPE replaces fixed sinusoidal or rotary positional encodings with per-dimension learned Morlet wavelets that recover prior methods as limiting cases and add a Gaussian locality kernel, yielding a 0.119 validation loss improvement on TinyShakespeare when paired with energy-gated attention.

Energy-Gated Attention: Spectral Salience as an Inductive Bias for Transformer Attention

cs.LG · 2026-05-21 · unverdicted · novelty 6.0

Energy-Gated Attention improves language model validation loss by gating attention according to spectral energy of key embeddings discovered by a learned projection, with consistent gains on TinyShakespeare and Penn Treebank using under 0.26% extra parameters.

Energy-Gated Attention and Wavelet Positional Encoding: Complementary Inductive Biases for Transformer Attention

cs.LG · 2026-05-25 · unverdicted · novelty 5.0

EGA and MoPE together yield a 0.119 validation loss improvement on TinyShakespeare that exceeds the sum of their individual effects, indicating complementary inductive biases for salience and locality.

Multiscale POD of Transformer Attention Fields: Scale-Selective Analysis via Morlet Scalogram

physics.flu-dyn · 2026-06-04 · unverdicted · novelty 4.0

Applies multiscale POD with Morlet scalograms to transformer attention fields to extract dominant modes per scale and reports layer-dependent scale organisation.

citing papers explorer

Showing 3 of 3 citing papers after filters.

Beyond Sinusoids: A Morlet Wavelet Framework for Transformer Positional Encoding

cs.LG · 2026-05-31 · unverdicted · none · ref 11

Energy-Gated Attention: Spectral Salience as an Inductive Bias for Transformer Attention

cs.LG · 2026-05-21 · unverdicted · none · ref 19

Energy-Gated Attention and Wavelet Positional Encoding: Complementary Inductive Biases for Transformer Attention

cs.LG · 2026-05-25 · unverdicted · none · ref 21

