DynMuon dynamically schedules the spectral exponent p in Muon-style updates according to curvature, noise, and training stage, yielding lower validation loss with 10-26% fewer steps than fixed Muon.
Canonical reference. 100% of citing Pith papers cite this work as background.
Citation roles: background (5)
Citation polarities: background (5)
Representative citing papers
Every fixed finite feedforward neural network definable in an o-minimal structure has finite sample complexity in the agnostic PAC setting.
Pre-training AeroTransformer on nearly 30,000 diverse wing geometries and fine-tuning with 450 specific samples achieves 0.36% error on surface-flow prediction for transonic wings, an 84.2% reduction versus training from scratch.
The paper proposes Retrieval Augmented Forecasting (RAF) that augments time-series foundation models with retrieved similar series to improve forecasting accuracy across domains.
S-FLM is a hyperspherical latent flow language model that learns velocity fields on the unit sphere to generate token sequences via deterministic ODE integration without materializing one-hot vectors.
PARSE trains a prompt-aware linear router on dense-model outputs to select dynamic SVD ranks, improving accuracy up to 10% at 0.6 compression ratio on LLaMA-7B while delivering 2.5x prefill and 2.4x decode speedups.
Derives closed-form optimal loss for unified diffusion models, provides variance-controlled estimators, and shows improved diagnosis, training schedules, and power-law scaling after subtracting the optimal value.
Repeated sampling scales problem coverage log-linearly with sample count, improving SWE-bench Lite performance from 15.9% to 56% using 250 samples.
Multiple Neural Operators achieve near-optimal approximation and generalization rates for multi-task operator learning, matching single-task scaling laws and performing similarly to a multi-task DeepONet extension.
Presents Instant3D for rapid text/image-to-3D generation via multi-view diffusion plus feed-forward reconstruction, and FastMap for 10x faster structure-from-motion with comparable accuracy.
Using prompts that incorporate implicature leads to responses that humans prefer 67.6% of the time over literal prompts, with larger models better at inferring intent.
This survey discusses key components and challenges for Personal LLM Agents and reviews solutions for their capability, efficiency, and security.
citing papers explorer
Every Feedforward Neural Network Definable in an o-Minimal Structure Has Finite Sample Complexity
Every fixed finite feedforward neural network definable in an o-minimal structure has finite sample complexity in the agnostic PAC setting.