Derives geodesic ridge regularization and Riemannian Gibbs Process prior for feature-learning wide neural networks, generalizing kernel-regime results via function-space axiomatization.
hub Mixed citations
Pytorch: An imperative style, high-performance deep learning library.Advances in neural information processing systems, 32
Mixed citation behavior. Most common role is background (45%).
hub tools
citation-role summary
citation-polarity summary
representative citing papers
Subliminal learning occurs via compatible auxiliary and class output heads on task-unrelated inputs, even with random hidden layers or architecture changes, with theory and upper bounds on failure.
CRiSP uses neural-guided MCTS and curriculum learning to insert Clifford prefixes before parameterized rotations in VQAs, yielding mean 3.17x and max 45x gains in energy accuracy on 22-qubit QAOA benchmarks versus prior Clifford initializers.
SPA unlocks patch-level features in CLIP for class-incremental learning via semantic-guided selection and optimal transport alignment with class descriptions, plus projectors and pseudo-feature replay to reduce forgetting.
Spectral Energy Centroid is a new metric that quantifies signal frequency and INR spectral bias, supporting better hyperparameter selection and cross-architecture analysis.
QAP-Router models qubit routing as dynamic QAP and applies RL with a solution-aware Transformer to cut CNOT counts by 12-30% versus industry compilers on real circuit benchmarks.
LookWhen factorizes video recognition into learning when, where, and what to compute via uniqueness-based token selection and dual-teacher distillation, achieving better accuracy-FLOPs trade-offs than baselines on multiple datasets.
Sublinear neural networks parametrize convex sets by learning their support and gauge functions, backed by a universal approximation theorem and tested on shape optimization tasks.
MARS parallel reservoirs achieve up to 21x training speedups and outperform LRU, S5, and Mamba on long sequence benchmarks while remaining gradient-free and compact.
Physics-augmented neural networks act as stable, thermodynamically consistent surrogates for microscale problems, enabling simultaneous optimization of macroscale material layout and microscale descriptors in nonlinear finite-strain anisotropic hyperelastic structures.
MemDLM embeds a simulated denoising trajectory into DLM training via bi-level optimization, creating a parametric memory that improves convergence and long-context performance even when the memory is dropped at test time.
FBApro computes the nearest steady-state flux distribution to a reference vector via a closed-form linear projection derived from orthogonal projections onto affine spaces.
AVIS applies autoregressive diffusion models to video inverse problems by streaming restoration with measurement-consistent initialization, reducing latency from 114s to 4s and raising throughput to 1.18 FPS (or 5.91 FPS in the Flash variant).
GeoHand adapts priors from a general-scene geometry estimator via a GeoAdapter, gated fusion, and keypoint-queried refiner to reach SOTA monocular 3D hand reconstruction on FreiHAND, DexYCB, and HO3Dv3 under heavy occlusion.
Two methods are introduced to learn plug-in composite surrogates that maximize effect predictiveness, with the direct surrogate-effect modeling approach outperforming baselines on synthetic data with known effects and real-world experiment data.
DMI-Lib delivers 0.4-6.8% overhead for offline batch LLM inference and ~6% for moderate online serving while exposing rich internal signals across backends, cutting latency overhead 2-15x versus prior observability baselines.
DuetFair couples inter-subgroup adaptation with intra-subgroup robustness via FairDRO (dMoE plus subgroup-conditioned DRO) to boost worst-case and equity-scaled performance on medical segmentation benchmarks.
An optimal control formulation adds time-dependent perturbations to the reverse diffusion process to match target attribute distributions while preserving sample fidelity.
FlashSAC improves training speed and final performance of off-policy RL on high-dimensional robot tasks by reducing update frequency, increasing model scale, and bounding norms to limit critic error accumulation.
ARS shapes reasoning trace representations by clustering states that produce consistent answers and separating those that produce inconsistent ones via latent perturbations, improving plug-and-play hallucination detection without human annotations.
The paper presents stable-worldmodel (swm), a platform with high-performance data layer, modern world model baselines, planning solvers, and extended environments for reproducible research and generalization evaluation.
MahaVar augments the Mahalanobis OOD score with class-wise distance variance, which is theoretically higher for in-distribution samples under relaxed Neural Collapse geometry.
SymADiT generates stable symmetric materials by enforcing Wyckoff-position and space-group constraints inside a latent generative model built on the prior ADiT architecture.
U²AD learns unified normal data representations via score-based generative modeling and a novel time-dependent score network to outperform prior methods in accuracy and early anomaly detection for multivariate time series.
citing papers explorer
-
Canonical Regularisation of Wide Feature-Learning Neural Networks
Derives geodesic ridge regularization and Riemannian Gibbs Process prior for feature-learning wide neural networks, generalizing kernel-regime results via function-space axiomatization.
-
Learning Through Noise: Why Subliminal Learning Works and When It Fails
Subliminal learning occurs via compatible auxiliary and class output heads on task-unrelated inputs, even with random hidden layers or architecture changes, with theory and upper bounds on failure.
-
Classical State Preparation for Variational Quantum Algorithms via Reinforcement Learning
CRiSP uses neural-guided MCTS and curriculum learning to insert Clifford prefixes before parameterized rotations in VQAs, yielding mean 3.17x and max 45x gains in energy accuracy on 22-qubit QAOA benchmarks versus prior Clifford initializers.
-
Unlocking Patch-Level Features for CLIP-Based Class-Incremental Learning
SPA unlocks patch-level features in CLIP for class-incremental learning via semantic-guided selection and optimal transport alignment with class descriptions, plus projectors and pseudo-feature replay to reduce forgetting.
-
Spectral Energy Centroid: a Metric for Improving Performance and Analyzing Spectral Bias in Implicit Neural Representations
Spectral Energy Centroid is a new metric that quantifies signal frequency and INR spectral bias, supporting better hyperparameter selection and cross-architecture analysis.
-
QAP-Router: Tackling Qubit Routing as Dynamic Quadratic Assignment with Reinforcement Learning
QAP-Router models qubit routing as dynamic QAP and applies RL with a solution-aware Transformer to cut CNOT counts by 12-30% versus industry compilers on real circuit benchmarks.
-
LookWhen? Fast Video Recognition by Learning When, Where, and What to Compute
LookWhen factorizes video recognition into learning when, where, and what to compute via uniqueness-based token selection and dual-teacher distillation, achieving better accuracy-FLOPs trade-offs than baselines on multiple datasets.
-
Parametrizing Convex Sets Using Sublinear Neural Networks
Sublinear neural networks parametrize convex sets by learning their support and gauge functions, backed by a universal approximation theorem and tested on shape optimization tasks.
-
Scalable Memristive-Friendly Reservoir Computing for Time Series Classification
MARS parallel reservoirs achieve up to 21x training speedups and outperform LRU, S5, and Mamba on long sequence benchmarks while remaining gradient-free and compact.
-
Multiscale topology optimization of compressible and nearly incompressible anisotropic hyperelastic structures using physics-augmented neural networks
Physics-augmented neural networks act as stable, thermodynamically consistent surrogates for microscale problems, enabling simultaneous optimization of macroscale material layout and microscale descriptors in nonlinear finite-strain anisotropic hyperelastic structures.
-
MemDLM: Memory-Enhanced DLM Training
MemDLM embeds a simulated denoising trajectory into DLM training via bi-level optimization, creating a parametric memory that improves convergence and long-context performance even when the memory is dropped at test time.
-
FBApro: A fast, simple linear transformation for diverse metabolic modeling tasks
FBApro computes the nearest steady-state flux distribution to a reference vector via a closed-form linear projection derived from orthogonal projections onto affine spaces.
-
Accelerating Video Inverse Problem Solvers with Autoregressive Diffusion Models
AVIS applies autoregressive diffusion models to video inverse problems by streaming restoration with measurement-consistent initialization, reducing latency from 114s to 4s and raising throughput to 1.18 FPS (or 5.91 FPS in the Flash variant).
-
GeoHand: Unlocking Prior Geometry Knowledge for Monocular 3D Hand Reconstruction
GeoHand adapts priors from a general-scene geometry estimator via a GeoAdapter, gated fusion, and keypoint-queried refiner to reach SOTA monocular 3D hand reconstruction on FreiHAND, DexYCB, and HO3Dv3 under heavy occlusion.
-
Learning plug-in surrogate endpoints for randomized experiments
Two methods are introduced to learn plug-in composite surrogates that maximize effect predictiveness, with the direct surrogate-effect modeling approach outperforming baselines on synthetic data with known effects and real-world experiment data.
-
Enabling Performant and Flexible Model-Internal Observability for LLM Inference
DMI-Lib delivers 0.4-6.8% overhead for offline batch LLM inference and ~6% for moderate online serving while exposing rich internal signals across backends, cutting latency overhead 2-15x versus prior observability baselines.
-
DuetFair: Coupling Inter- and Intra-Subgroup Robustness for Fair Medical Image Segmentation
DuetFair couples inter-subgroup adaptation with intra-subgroup robustness via FairDRO (dMoE plus subgroup-conditioned DRO) to boost worst-case and equity-scaled performance on medical segmentation benchmarks.
-
Inference-Time Attribute Distribution Alignment for Unconditional Diffusion
An optimal control formulation adds time-dependent perturbations to the reverse diffusion process to match target attribute distributions while preserving sample fidelity.
-
FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control
FlashSAC improves training speed and final performance of off-policy RL on high-dimensional robot tasks by reducing update frequency, increasing model scale, and bounding norms to limit critic error accumulation.
-
Harnessing Reasoning Trajectories for Hallucination Detection via Answer-agreement Representation Shaping
ARS shapes reasoning trace representations by clustering states that produce consistent answers and separating those that produce inconsistent ones via latent perturbations, improving plug-and-play hallucination detection without human annotations.
-
stable-worldmodel: A Platform for Reproducible World Modeling Research and Evaluation
The paper presents stable-worldmodel (swm), a platform with high-performance data layer, modern world model baselines, planning solvers, and extended environments for reproducible research and generalization evaluation.
-
MahaVar: OOD Detection via Class-wise Mahalanobis Distance Variance under Neural Collapse
MahaVar augments the Mahalanobis OOD score with class-wise distance variance, which is theoretically higher for in-distribution samples under relaxed Neural Collapse geometry.
-
Generating Symmetric Materials using Latent Flow Matching
SymADiT generates stable symmetric materials by enforcing Wyckoff-position and space-group constraints inside a latent generative model built on the prior ADiT architecture.
-
Learning Unified Representations of Normalcy for Time Series Anomaly Detection
U²AD learns unified normal data representations via score-based generative modeling and a novel time-dependent score network to outperform prior methods in accuracy and early anomaly detection for multivariate time series.
-
Tabular Foundation Model for Generative Modelling
TabFORGE generates high-quality synthetic tabular data by leveraging pretrained causality-aware representations in a two-stage diffusion-decoder architecture that mitigates latent distribution shifts.
-
Self-Play Enhancement via Advantage-Weighted Refinement in Online Federated LLM Fine-Tuning with Real-Time Feedback
SPEAR enables online federated LLM fine-tuning by using feedback-guided self-play to create contrastive pairs trained with maximum likelihood on correct completions and confidence-weighted unlikelihood on incorrect ones, outperforming baselines without ground-truth contexts.
-
BGM-IV: an AI-powered Bayesian generative modeling approach for instrumental variable analysis
BGM-IV performs nonlinear IV regression by inferring causally structured latent components and replacing the outcome likelihood with an instrument-averaged pseudo-likelihood, showing strongest results in high-dimensional covariate regimes.
-
Cubit: Token Mixer with Kernel Ridge Regression
Cubit replaces Transformer's attention with a closed-form Kernel Ridge Regression token mixer and reports larger gains as training sequence length increases.
-
Drivetrain simulation using variational autoencoders
Variational autoencoders generate jerk signals from torque inputs in electric drivetrains and outperform physics-based baselines without detailed parametrization.
-
Triple Configuration of Brain Networks Based on Recurrent Neural Networks: The Synergistic Effects of Exogenous Stimuli, Task Demands, and Spontaneous Activity
RNNs with dynamic constraints applied to EEG data separate brain network activity into three configurations driven by stimuli, tasks, and spontaneous processes, highlighting the parietal network as a central hub.
-
Benchmarking PyCaret AutoML Against BiLSTM for Fine-Grained Emotion Classification: A Comparative Study on 20-Class Emotion Detection
BiLSTM achieves 89% accuracy and 0.89 weighted F1 on 20-class emotion detection, marginally outperforming SVM at 88.11% on a 79,595-sentence dataset.