citation dossier
Hornik, K. (1991). Approximation capabilities of multilayer feedforward networks. Neural Networks, 4(2):251–257.
why this work matters in Pith
Pith has found this work in 2 reviewed papers. Its strongest current cluster is cs.LG (1 paper). The largest review-status bucket among citing papers is UNVERDICTED (2 papers). For highly cited works, this page shows a dossier first and a bounded explorer second; it never tries to render every citing paper at once.
years
2026: 2

verdicts
UNVERDICTED: 2

representative citing papers
- Learning stochastic multiscale models through normalizing flows
- Training Transformers for KV Cache Compressibility
citing papers explorer
- Learning stochastic multiscale models through normalizing flows
  A framework learns effective multiscale stochastic dynamics from single slow-variable paths by parameterizing the invariant distribution of the fast process with a normalizing flow, trained end-to-end via a penalized likelihood derived from stochastic averaging (see the first sketch after this list).
- Training Transformers for KV Cache Compressibility
  Training transformers with KV sparsification during continued pretraining produces representations that admit better post-hoc KV cache compression, improving quality under memory budgets for long-context tasks (see the second sketch after this list).
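
The first explorer entry describes an end-to-end loop: a conditional normalizing flow stands in for the invariant density p_x(y) of the fast process, the averaged slow drift is a Monte Carlo expectation under that flow, and the objective is a path likelihood plus a penalty tying the flow to the fast dynamics. Below is a minimal PyTorch sketch of that loop; the microscale model f, g, the toy affine flow CondFlow, the stationarity penalty, and all constants are illustrative assumptions, not the paper's construction.

```python
import torch
import torch.nn as nn

# Assumed toy microscale system (not from the paper):
#   slow: dX = f(X, Y) dt + sigma dW,  fast: dY = g(X, Y)/eps dt + (s/sqrt(eps)) dB.
# Stochastic averaging replaces f(x, y) with fbar(x) = E_{y ~ p_x}[f(x, y)],
# where p_x is the invariant density of the fast process at frozen x.

def f(x, y):                 # slow drift, coupled to the fast variable
    return -x + y

def g(x, y):                 # fast drift at frozen slow state x
    return -(y - x)

S_FAST = 0.5                 # fast diffusion coefficient (assumed known)

class CondFlow(nn.Module):
    """Toy conditional flow for p_x(y): y = mu(x) + exp(log_s(x)) * z."""
    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 2))

    def sample(self, x, n):  # x: (T, 1) -> samples: (n, T, 1)
        mu, log_s = self.net(x).chunk(2, dim=-1)
        z = torch.randn(n, *mu.shape)
        return mu + log_s.exp() * z   # reparameterized, so gradients flow

flow = CondFlow()
log_sigma_bar = torch.zeros(1, requires_grad=True)   # effective slow diffusion
opt = torch.optim.Adam(list(flow.parameters()) + [log_sigma_bar], lr=1e-3)

# Synthetic stand-in for the observed slow-variable path.
dt, T = 0.01, 400
xs = torch.zeros(T + 1, 1)
for t in range(T):
    xs[t + 1] = xs[t] - xs[t] * dt + 0.2 * dt**0.5 * torch.randn(1)

for step in range(200):
    x, dx = xs[:-1], xs[1:] - xs[:-1]
    y = flow.sample(x, n=64)                       # fast samples per slow state

    # Averaged drift fbar(x) as a Monte Carlo mean over flow samples.
    fbar = f(x.unsqueeze(0), y).mean(0)

    # Euler-Maruyama Gaussian likelihood of the observed slow increments.
    var = log_sigma_bar.exp() ** 2 * dt
    nll = (0.5 * (dx - fbar * dt) ** 2 / var + 0.5 * var.log()).mean()

    # Penalty: stationarity of the flow under the fast generator L, i.e.
    # E_{p_x}[L phi(y)] = 0 for test functions phi(y) = y and y^2.
    xb = x.unsqueeze(0)
    r1 = g(xb, y).mean(0)                          # L y   = g
    r2 = (2 * y * g(xb, y) + S_FAST**2).mean(0)    # L y^2 = 2 y g + s^2
    penalty = (r1**2 + r2**2).mean()

    loss = nll + 10.0 * penalty
    opt.zero_grad(); loss.backward(); opt.step()
```

The stationarity residuals r1 and r2 are one plausible way to penalize a flow toward the fast process's invariant law; the paper's actual penalized likelihood may differ.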
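
The second entry's mechanism can likewise be sketched. Assuming the sparsification is per-query top-k masking of attention scores (the function name sparsified_attention, the keep_ratio parameter, and the top-k rule are illustrative guesses, not the paper's recipe), the causal attention variant below drops low-scoring keys during continued pretraining so the model learns representations that tolerate a compressed KV cache at inference.

```python
import torch
import torch.nn.functional as F

def sparsified_attention(q, k, v, keep_ratio=0.25):
    """Causal attention that keeps only the top-scoring fraction of keys
    per query during training. q, k, v: (B, heads, T, head_dim)."""
    B, H, T, D = q.shape
    scores = q @ k.transpose(-1, -2) / D**0.5             # (B, H, T, T)
    causal = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(causal, float("-inf"))

    # KV sparsification: per query, keep the k_keep highest scores and
    # drop the rest, so training matches the post-hoc-compressed cache.
    k_keep = max(1, int(keep_ratio * T))
    kth = scores.topk(k_keep, dim=-1).values[..., -1:]    # k-th best score
    scores = scores.masked_fill(scores < kth, float("-inf"))

    return F.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(2, 4, 16, 8)
out = sparsified_attention(q, k, v)                        # (2, 4, 16, 8)
```

Training under such a mask is what the abstract credits with making later, post-hoc KV eviction cheaper in quality terms; the exact scoring and eviction policy is not specified on this page.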