Flow matching on time series targets a closed-form nonparametric velocity field that is a similarity-weighted mixture of observed transition velocities, making neural models approximations to an ideal memory-augmented dynamical system sampler.
Attention is all you need.Advances in Neural Information Processing Systems, 30
3 Pith papers cite this work. Polarity classification is still indexing.
3
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
verdicts
UNVERDICTED 3roles
background 1polarities
background 1representative citing papers
A shallow dense Transformer achieves uniform epsilon-approximation of alpha-Holder functions with O(epsilon^{-d/alpha}) parameters and near-minimax generalization error O(n^{-2alpha/(2alpha+d)} log n).
Quantum diffusion models develop a distinct barren plateau beyond small qubit counts; an architectural enhancement and conditional formulation restore trainability for Hamiltonian-parameterized ground-state generation.
citing papers explorer
-
Learning Theory of Transformers: Local-to-Global Approximation via Softmax Partition of Unity
A shallow dense Transformer achieves uniform epsilon-approximation of alpha-Holder functions with O(epsilon^{-d/alpha}) parameters and near-minimax generalization error O(n^{-2alpha/(2alpha+d)} log n).