Approximation capabilities of multilayer feedforward networks. Neural Networks, 4(2):251–257

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2 other 1

citation-polarity summary

years

2026 4

verdicts

UNVERDICTED 4

polarities

background 2 unclear 1

representative citing papers

Any-Dimensional Invariant Universality

cs.LG · 2026-05-22 · unverdicted · novelty 8.0

A systematic approach maps any-dimensional invariant functions to a unique function on an infinite-dimensional limit space admitting a topology with compact sets where universality holds, with examples of non-universal architectures and fixes.

Learning stochastic multiscale models through normalizing flows

stat.ML · 2026-05-10 · unverdicted · novelty 7.0

A framework learns effective multiscale stochastic dynamics from single slow-variable paths by parameterizing the fast process invariant distribution with normalizing flows, trained end-to-end via penalized likelihood from stochastic averaging.

Training Transformers for KV Cache Compressibility

cs.LG · 2026-05-07 · unverdicted · novelty 6.0 · 2 refs

Training transformers with KV sparsification during continued pretraining produces representations that admit better post-hoc KV cache compression, improving quality under memory budgets for long-context tasks.

citing papers explorer

Showing 3 of 3 citing papers after filters.

  • Any-Dimensional Invariant Universality cs.LG · 2026-05-22 · unverdicted · none · ref 33

    A systematic approach maps any-dimensional invariant functions to a unique function on an infinite-dimensional limit space admitting a topology with compact sets where universality holds, with examples of non-universal architectures and fixes.

  • Universal Approximation of Nonlinear Operators and Their Derivatives cs.LG · 2026-05-14 · unverdicted · none · ref 61

    Proves the first universal approximation theorems (UATs) for k-times differentiable nonlinear operators and their derivatives via OL architectures, uniformly on compact sets in weighted Bastiani-Sobolev spaces on general Banach spaces.

  • Training Transformers for KV Cache Compressibility cs.LG · 2026-05-07 · unverdicted · none · ref 21 · 2 links

    Training transformers with KV sparsification during continued pretraining produces representations that admit better post-hoc KV cache compression, improving quality under memory budgets for long-context tasks.