pith. sign in

arXiv preprint arXiv:2411.04551 , year=

10 Pith papers cite this work. Polarity classification is still indexing.

10 Pith papers citing it

citation-role summary

background 1

citation-polarity summary

years

2026 9 2025 1

roles

background 1

polarities

background 1

clear filters

representative citing papers

Reachability and asymptotics of Gaussian Transformer dynamics

cs.LG · 2026-05-29 · unverdicted · novelty 8.0

Gaussian distributions are invariant under the mean-field Transformer flow, reducing infinite-dimensional dynamics to a bilinear control system on mean and covariance with explicit reachability and stability results.

Transformer-like Inference from Optimal Control

cs.LG · 2026-05-15 · unverdicted · novelty 7.0

Derives transformer-like dual-filter inference layers from first-principles optimal control on nonlinear discrete and linear Gaussian sequence models.

Constructive conditional normalizing flows

math.OC · 2026-02-09 · unverdicted · novelty 7.0

Explicit constructions approximate diffeomorphisms and pushforward measures via continuity equation flows with perceptron velocity fields of piecewise constant weights, using polar-like decompositions and probabilistic methods for regular maps.

Exact Sequence Interpolation with Transformers

cs.LG · 2025-02-04 · conditional · novelty 7.0

Transformers with O(sum m^j) blocks and O(d sum m^j) parameters can exactly interpolate any finite dataset of input sequences in R^d to output sequences of lengths m^j.

Propagation of Chaos in Contextual Flow Maps

cs.LG · 2026-05-16 · unverdicted · novelty 6.0

Derives forward and backward propagation-of-chaos bounds for finite vs. infinite-context transformers modeled as contextual flow maps, achieving Wasserstein rate n^{-1/d} generally and n^{-1/2} for transformer-like cases.

citing papers explorer

Showing 6 of 6 citing papers after filters.

  • Reachability and asymptotics of Gaussian Transformer dynamics cs.LG · 2026-05-29 · unverdicted · none · ref 2

    Gaussian distributions are invariant under the mean-field Transformer flow, reducing infinite-dimensional dynamics to a bilinear control system on mean and covariance with explicit reachability and stability results.

  • Transformer-like Inference from Optimal Control cs.LG · 2026-05-15 · unverdicted · none · ref 3

    Derives transformer-like dual-filter inference layers from first-principles optimal control on nonlinear discrete and linear Gaussian sequence models.

  • Perceptrons and localization of attention's mean-field landscape cs.LG · 2026-01-29 · unverdicted · none · ref 9

    In the mean-field limit of attention with perceptron blocks, critical points of the energy landscape are generically atomic and localized on subsets of the unit sphere.

  • Exact Sequence Interpolation with Transformers cs.LG · 2025-02-04 · conditional · none · ref 12

    Transformers with O(sum m^j) blocks and O(d sum m^j) parameters can exactly interpolate any finite dataset of input sequences in R^d to output sequences of lengths m^j.

  • Propagation of Chaos in Contextual Flow Maps cs.LG · 2026-05-16 · unverdicted · none · ref 14

    Derives forward and backward propagation-of-chaos bounds for finite vs. infinite-context transformers modeled as contextual flow maps, achieving Wasserstein rate n^{-1/d} generally and n^{-1/2} for transformer-like cases.

  • Multi-Headed Transformer Architectures as Time-dependent Wasserstein Gradient Flows cs.LG · 2026-05-15 · unverdicted · none · ref 22

    Models multi-head transformer data flow as time-dependent Wasserstein gradient flows of an attention-capturing interaction energy, with proofs on omega-limit stationary points and stability under weight and input perturbations.