pith. sign in

hub

Optimizing neural networks with kronecker-factored approximate curvature

13 Pith papers cite this work. Polarity classification is still indexing.

13 Pith papers citing it

hub tools

citation-role summary

background 2

citation-polarity summary

years

2026 13

verdicts

UNVERDICTED 13

roles

background 2

polarities

background 2

clear filters

representative citing papers

Dead Directions: Geometric Singular Learning

cs.LG · 2026-06-04 · unverdicted · novelty 7.0

Dead directions recover Watanabe's RLCT contribution and triple (λ, m, ν) from directional Fisher curvature decay rates in original parameter space for singular models, extended via K-FAC to networks and gauge-equivariant optimizers.

Runtime-Orchestrated Second-Order Optimization for Scalable LLM Training

cs.DC · 2026-05-15 · unverdicted · novelty 6.0

Asteria is a runtime system that enables second-order optimization for LLMs by dynamically distributing optimizer state across GPU, CPU, and NVMe while using asynchronous inverse-root computations and bounded-staleness synchronization.

Compress Then Adapt? No, Do It Together via Task-aware Union of Subspaces

cs.AI · 2026-05-04 · unverdicted · novelty 6.0

JACTUS unifies low-rank compression and task adaptation via a task-aware union of subspaces and global rank allocation by marginal gain, outperforming 100% PEFT methods like DoRA on ViT-Base (89.2% avg) and Llama2-7B (80.9% avg) at 80% retained parameters.

Natural gradient descent with momentum

cs.LG · 2026-04-16 · unverdicted · novelty 5.0

Introduces natural-gradient versions of Heavy-Ball and Nesterov momentum methods for function approximation on differentiable nonlinear manifolds.

citing papers explorer

Showing 0 of 0 citing papers after filters.

No citing papers match the current filters.