Mixed citations

A Stochastic Approximation Method

doi: 10 · 1951 · arXiv aoms/1177729

Mixed citation behavior. Most common role is background (62%).

94 Pith papers citing it

Background 62% of classified citations


citation-role summary

background 10 method 3

citation-polarity summary

background 8 use method 3 unclear 2

representative citing papers

Random Reshuffling Dominates Stochastic Gradient Descent

math.OC · 2026-06-30 · unverdicted · novelty 8.0

RR dominates SGD in smooth convex optimization under any reasonable stepsize after any finite number of epochs.

Optimal Deterministic Multicalibration and Omniprediction

cs.LG · 2026-06-18 · unverdicted · novelty 8.0

Presents a deterministic minimax-optimal multicalibration algorithm and its generalization to outcome indistinguishability and omniprediction, resolving open questions on randomization necessity.

The Geometric Wall: Manifold Structure Predicts Layerwise Sparse Autoencoder Scaling Laws

cs.LG · 2026-05-11 · unverdicted · novelty 8.0

Manifold curvature and intrinsic dimension predict layerwise SAE width exponents and asymptotic floors across Gemma models, with cross-model transfer of the geometric regression, establishing a transferable geometric law instead of a universal scaling law.

Steered LLM Activations are Non-Surjective

cs.AI · 2026-04-10 · unverdicted · novelty 8.0 · 2 refs

Steered LLM activations are non-surjective: under practical assumptions, they lie outside the set of states reachable from any discrete prompt.

Quantum-optimal coronagraphy with spatial mode sorting for direct exoplanet observations

astro-ph.IM · 2026-07-02 · unverdicted · novelty 7.0

The paper derives quantum-optimal spatial modes for mode-sorting coronagraphy that account for finite star size and complex apertures, improving detection performance at close working angles.

Fast Computation of Free-Support Wasserstein Medians

stat.CO · 2026-06-17 · unverdicted · novelty 7.0

Direct fixed-weight solver for free-support Wasserstein medians relocates atoms using OT barycentric projections and inverse-distance weights, achieving monotone descent on smoothed objectives with fewer subproblems than nested Weiszfeld baselines.

Expected Free Energy-based Planning as Variational Inference

cs.AI · 2026-06-09 · unverdicted · novelty 7.0

EFE-based planning is formulated as variational free energy minimization with epistemic priors, decomposing into expected plan costs plus a complexity term.

Arbitrage-free Data Pricing

cs.GT · 2026-06-09 · unverdicted · novelty 7.0

The paper shows that arbitrage-free information pricing is computationally hard in general, provides a branch-and-bound algorithm, and proves that for threshold utilities arbitrage-freeness reduces to Blackwell dominance, unifying prior query and model pricing results.

What Type of Inference is Active Inference?

cs.AI · 2026-06-03 · unverdicted · novelty 7.0

EFE-based active inference planning is characterized as VFE on an augmented model plus entropy and planning corrections, with a derived message-passing implementation and grid-world validation.

Experimental Collapse in Virophysics: Protocol-Resolved Observation, Inference, and Plaque-Assay Blindness

physics.bio-ph · 2026-05-27 · unverdicted · novelty 7.0

The paper introduces a protocol-resolved framework for virological measurements, defining an observation operator that maps latent ensembles to observed data and recasting plaque assays as estimates of protocol-conditioned infectious concentration.

Scale-Calibrated Median-of-Means for Robust Distributed Principal Component Analysis

stat.ME · 2026-05-20 · unverdicted · novelty 7.0

Proposes a scale-calibrated median-of-means estimator for robust aggregation of distributed PCA estimates on the product of Euclidean space and Grassmann manifold.

The Spatial Cramér--von Mises Test of Independence under $\beta$-Mixing: Asymptotic Theory and Python Implementation

stat.ME · 2026-05-18 · unverdicted · novelty 7.0

Derives the asymptotic distribution of the spatial Cramér-von Mises independence statistic under β-mixing on R² and implements it in Python with eigenvalue-based critical values.

Quantum enhanced identification of boosted jets with quantum graph neural networks

hep-ph · 2026-05-18 · unverdicted · novelty 7.0

A 10-qubit convolutional quantum graph neural network fed by autoencoder-compressed jet data achieves performance comparable to classical graph networks in distinguishing boosted Z jets from gluon jets.

Generative reconstruction of 2D and 3D polycrystalline microstructures using symmetrized hyperspherical harmonics

cond-mat.mtrl-sci · 2026-05-14 · unverdicted · novelty 7.0

A new differentiable reconstruction method uses symmetrized hyperspherical harmonics on quaternions plus two- and three-point descriptors to generate 3D microstructures from 2D data, demonstrated on aluminum alloy with L-BFGS-B optimization.

Characterizing the Generalization Error of Random Feature Regression with Arbitrary Data-Augmentation

stat.ML · 2026-05-11 · conditional · novelty 7.0

The test error of random-feature ridge regression with arbitrary data augmentation admits a closed-form asymptotic characterization in the proportional regime that depends only on population covariances and augmentation statistics.

Entropic Reciprocity in Time-Reversed Young Interferometry

quant-ph · 2026-05-01 · unverdicted · novelty 7.0

Time-reversed Young interferometry acts as a source-space information processor where mutual information is the reciprocal invariant and source-label entropy can decrease near destructive interference while Fisher information rises.

Fast and Exact: Asymptotically Linear KL-Optimal Frequency Normalization

cs.IT · 2026-05-01 · unverdicted · novelty 7.0

Three new provably KL-optimal frequency normalization algorithms are presented, one running in linear time in the number of symbols.

Profile Likelihood Inference for Anisotropic Hyperbolic Wrapped Normal Models on Hyperbolic Space

math.ST · 2026-05-01 · unverdicted · novelty 7.0

The profile maximum likelihood estimator for the location in anisotropic hyperbolic wrapped normal models is strongly consistent, asymptotically normal, and attains the Hájek-Le Cam minimax lower bound under squared geodesic loss.

Complexity Guarantees for Zeroth-order Methods via Exponentially-shifted Gaussian Smoothing: Mitigating Dimension-dependence and Incorporating Decision-dependence

math.OC · 2026-04-16 · unverdicted · novelty 7.0

Exponentially-shifted Gaussian smoothing yields zeroth-order gradient estimators with linear dimension dependence, enabling improved complexity bounds for stochastic optimization including decision-dependent regimes.

Reinforcement Learning via Value Gradient Flow

cs.LG · 2026-04-15 · unverdicted · novelty 7.0

VGF solves behavior-regularized RL by transporting particles from a reference distribution to the value-induced optimal policy via discrete value-guided gradient flow.

Stability of the Shannon--McMillan--Breiman Theorem under Sublinear Parsings

cs.IT · 2026-04-15 · unverdicted · novelty 7.0

The normalized sum of negative log-likelihoods under sublinear parsings converges almost surely and in L1 to the entropy rate h_P for any shift-invariant measure on a finite shift space.

Obtaining Partition Crossover masks using Statistical Linkage Learning for solving noised optimization problems with hidden variable dependency structure

stat.ML · 2026-04-13 · unverdicted · novelty 7.0

Statistical Linkage Learning enables a new mask construction algorithm for Partition Crossover that maintains effectiveness on noisy problems with hidden dependencies and matches noise-free performance when decomposition quality is high.

Many-Tier Instruction Hierarchy in LLM Agents

cs.CL · 2026-04-10 · unverdicted · novelty 7.0

ManyIH and ManyIH-Bench address instruction conflicts in LLM agents with up to 12 privilege levels across 853 tasks, revealing frontier models achieve only ~40% accuracy.

Causal Multi-Task Demand Learning

cs.LG · 2026-02-10 · unverdicted · novelty 7.0

A meta-learning method identifies the conditional mean of task-specific causal demand parameters by conditioning on all prices while masking two demand outcomes, assuming at least two locally exogenous prices per task.

citing papers explorer

Showing 50 of 94 citing papers.

Random Reshuffling Dominates Stochastic Gradient Descent math.OC · 2026-06-30 · unverdicted · none · ref 25
RR dominates SGD in smooth convex optimization under any reasonable stepsize after any finite number of epochs.
Optimal Deterministic Multicalibration and Omniprediction cs.LG · 2026-06-18 · unverdicted · none · ref 4
Presents a deterministic minimax-optimal multicalibration algorithm and its generalization to outcome indistinguishability and omniprediction, resolving open questions on randomization necessity.
The Geometric Wall: Manifold Structure Predicts Layerwise Sparse Autoencoder Scaling Laws cs.LG · 2026-05-11 · unverdicted · none · ref 40
Manifold curvature and intrinsic dimension predict layerwise SAE width exponents and asymptotic floors across Gemma models, with cross-model transfer of the geometric regression, establishing a transferable geometric law instead of a universal scaling law.
Steered LLM Activations are Non-Surjective cs.AI · 2026-04-10 · unverdicted · none · ref 4 · 2 links
Steered LLM activations are non-surjective: under practical assumptions, they lie outside the set of states reachable from any discrete prompt.
Quantum-optimal coronagraphy with spatial mode sorting for direct exoplanet observations astro-ph.IM · 2026-07-02 · unverdicted · none · ref 18
The paper derives quantum-optimal spatial modes for mode-sorting coronagraphy that account for finite star size and complex apertures, improving detection performance at close working angles.
Fast Computation of Free-Support Wasserstein Medians stat.CO · 2026-06-17 · unverdicted · none · ref 247
Direct fixed-weight solver for free-support Wasserstein medians relocates atoms using OT barycentric projections and inverse-distance weights, achieving monotone descent on smoothed objectives with fewer subproblems than nested Weiszfeld baselines.
Expected Free Energy-based Planning as Variational Inference cs.AI · 2026-06-09 · unverdicted · none · ref 166
EFE-based planning is formulated as variational free energy minimization with epistemic priors, decomposing into expected plan costs plus a complexity term.
Arbitrage-free Data Pricing cs.GT · 2026-06-09 · unverdicted · none · ref 8
The paper shows that arbitrage-free information pricing is computationally hard in general, provides a branch-and-bound algorithm, and proves that for threshold utilities arbitrage-freeness reduces to Blackwell dominance, unifying prior query and model pricing results.
What Type of Inference is Active Inference? cs.AI · 2026-06-03 · unverdicted · none · ref 181
EFE-based active inference planning is characterized as VFE on an augmented model plus entropy and planning corrections, with a derived message-passing implementation and grid-world validation.
Experimental Collapse in Virophysics: Protocol-Resolved Observation, Inference, and Plaque-Assay Blindness physics.bio-ph · 2026-05-27 · unverdicted · none · ref 28
The paper introduces a protocol-resolved framework for virological measurements, defining an observation operator that maps latent ensembles to observed data and recasting plaque assays as estimates of protocol-conditioned infectious concentration.
Scale-Calibrated Median-of-Means for Robust Distributed Principal Component Analysis stat.ME · 2026-05-20 · unverdicted · none · ref 82
Proposes a scale-calibrated median-of-means estimator for robust aggregation of distributed PCA estimates on the product of Euclidean space and Grassmann manifold.
The Spatial Cramér--von Mises Test of Independence under $\beta$-Mixing: Asymptotic Theory and Python Implementation stat.ME · 2026-05-18 · unverdicted · none · ref 1
Derives the asymptotic distribution of the spatial Cramér-von Mises independence statistic under β-mixing on R² and implements it in Python with eigenvalue-based critical values.
Quantum enhanced identification of boosted jets with quantum graph neural networks hep-ph · 2026-05-18 · unverdicted · none · ref 28
A 10-qubit convolutional quantum graph neural network fed by autoencoder-compressed jet data achieves performance comparable to classical graph networks in distinguishing boosted Z jets from gluon jets.
Generative reconstruction of 2D and 3D polycrystalline microstructures using symmetrized hyperspherical harmonics cond-mat.mtrl-sci · 2026-05-14 · unverdicted · none · ref 79
A new differentiable reconstruction method uses symmetrized hyperspherical harmonics on quaternions plus two- and three-point descriptors to generate 3D microstructures from 2D data, demonstrated on aluminum alloy with L-BFGS-B optimization.
Characterizing the Generalization Error of Random Feature Regression with Arbitrary Data-Augmentation stat.ML · 2026-05-11 · conditional · none · ref 39
The test error of random-feature ridge regression with arbitrary data augmentation admits a closed-form asymptotic characterization in the proportional regime that depends only on population covariances and augmentation statistics.
Entropic Reciprocity in Time-Reversed Young Interferometry quant-ph · 2026-05-01 · unverdicted · none · ref 37
Time-reversed Young interferometry acts as a source-space information processor where mutual information is the reciprocal invariant and source-label entropy can decrease near destructive interference while Fisher information rises.
Fast and Exact: Asymptotically Linear KL-Optimal Frequency Normalization cs.IT · 2026-05-01 · unverdicted · none · ref 2
Three new provably KL-optimal frequency normalization algorithms are presented, one running in linear time in the number of symbols.
Profile Likelihood Inference for Anisotropic Hyperbolic Wrapped Normal Models on Hyperbolic Space math.ST · 2026-05-01 · unverdicted · none · ref 14
The profile maximum likelihood estimator for the location in anisotropic hyperbolic wrapped normal models is strongly consistent, asymptotically normal, and attains the Hájek-Le Cam minimax lower bound under squared geodesic loss.
Complexity Guarantees for Zeroth-order Methods via Exponentially-shifted Gaussian Smoothing: Mitigating Dimension-dependence and Incorporating Decision-dependence math.OC · 2026-04-16 · unverdicted · none · ref 37
Exponentially-shifted Gaussian smoothing yields zeroth-order gradient estimators with linear dimension dependence, enabling improved complexity bounds for stochastic optimization including decision-dependent regimes.
Reinforcement Learning via Value Gradient Flow cs.LG · 2026-04-15 · unverdicted · none · ref 34
VGF solves behavior-regularized RL by transporting particles from a reference distribution to the value-induced optimal policy via discrete value-guided gradient flow.
Stability of the Shannon--McMillan--Breiman Theorem under Sublinear Parsings cs.IT · 2026-04-15 · unverdicted · none · ref 10
The normalized sum of negative log-likelihoods under sublinear parsings converges almost surely and in L1 to the entropy rate h_P for any shift-invariant measure on a finite shift space.
Obtaining Partition Crossover masks using Statistical Linkage Learning for solving noised optimization problems with hidden variable dependency structure stat.ML · 2026-04-13 · unverdicted · none · ref 10
Statistical Linkage Learning enables a new mask construction algorithm for Partition Crossover that maintains effectiveness on noisy problems with hidden dependencies and matches noise-free performance when decomposition quality is high.
Many-Tier Instruction Hierarchy in LLM Agents cs.CL · 2026-04-10 · unverdicted · none · ref 41
ManyIH and ManyIH-Bench address instruction conflicts in LLM agents with up to 12 privilege levels across 853 tasks, revealing frontier models achieve only ~40% accuracy.
Causal Multi-Task Demand Learning cs.LG · 2026-02-10 · unverdicted · none · ref 9
A meta-learning method identifies the conditional mean of task-specific causal demand parameters by conditioning on all prices while masking two demand outcomes, assuming at least two locally exogenous prices per task.
Shrinkage to Infinity: Reducing Test Error by Inflating the Minimum Norm Interpolator in Linear Models math.ST · 2025-10-22 · unverdicted · none · ref 11
Inflating the min-norm interpolator by a factor >1 reduces generalization error in linear regression with anisotropic covariances when d/n diverges to infinity.
Inherited or produced? Inferring protein production kinetics when protein counts are shaped by a cell's division history q-bio.QM · 2025-06-11 · unverdicted · none · ref 65
Conditional normalizing flows approximate intractable likelihoods arising from cell division history to conclude that glc3 is mostly inactive under nutrient stress in yeast, with brief transient expression.
Cutoff for mixtures of permuted Markov chains: reversible case math.PR · 2024-01-08 · unverdicted · none · ref 41
Proves cutoff at entropic time log n/h for reversible mixtures of permuted Markov chains under mild assumptions on the base chains.
SCAPE: Accurate and Efficient LLM Training with Extreme Sparse Communication cs.LG · 2026-07-02 · conditional · none · ref 30
SCAPE enables 90-99% sparse gradient communication in sharded Adam-style LLM training by deriving masks from first-moment statistics, achieving up to 43.3% faster pre-training on Llama-500M with no loss in validation loss or downstream accuracy.
J- and MJ-Type Tests for Non-Nested Parametric Survival Models with a Cure Fraction: A Score Test Approach stat.ME · 2026-07-01 · unverdicted · none · ref 10
Proposes score-test-based J and MJ statistics for non-nested cure-fraction survival models that reduce to a Vuong-like form for two models while extending to M models and providing a model-selection criterion.
Worst-Case Maximal Inequalities for Heavy-tailed Random Vectors math.ST · 2026-06-30 · unverdicted · none · ref 6
The paper characterizes the worst-case expected top-k norm of sample averages for heavy-tailed vectors up to universal constants under envelope moment conditions.
A Quantum-Classical Surrogate Model for the Collision Operator of the Lattice Boltzmann Method quant-ph · 2026-06-30 · unverdicted · none · ref 54
A quantum machine learning surrogate based on parameterized circuits with data re-uploading approximates the full BGK collision dynamics in LBM across all admissible relaxation parameters and is validated on Taylor-Green vortex and double shear layer benchmarks.
Spatio-Temporal Disaggregation with Changing Areal Boundaries stat.ME · 2026-06-23 · unverdicted · none · ref 28
A spatio-temporal disaggregation method that replaces lognormal polygon effects with gamma overdispersion to obtain a marginal negative binomial likelihood, reducing latent variables and enabling fast inference via the Extended Latent Gaussian Model framework.
A generalized multiple-intervention stepped wedge design framework for treatment effect estimation in the presence of non-uniform cluster-period correlation structures stat.ME · 2026-06-22 · unverdicted · none · ref 110 · 2 links
Develops a unified covariance framework for M-SWDs that accommodates non-uniform cluster-period correlations while preserving closed-form variance expressions for treatment effect estimators.
Coupling-Grouped XY-QAOA for Joint Anomaly-Feature Selection quant-ph · 2026-06-11 · unverdicted · none · ref 30
Coupling-Grouped XY-QAOA enables joint anomaly-feature selection via a constraint-preserving grouped-angle QAOA variant, achieving 45.9-61.3% circuit depth reduction and larger feasible executions (64 qubits at p=2) on IBM Heron hardware compared to standard approaches.
Dark Energy Survey Year 3 results: optimized $w$CDM simulation-based inference with weak lensing map-level hybrid statistics astro-ph.CO · 2026-06-09 · unverdicted · none · ref 18
DES Y3 weak lensing analysis with hybrid map-level statistics and simulation-based inference yields S8 = 0.808 ± 0.017, Ωm = 0.325 ± 0.024, and w < -0.766, improving the figure of merit by 60% over prior state-of-the-art.
A stochastic gradient algorithm for non-separable optimization with convergence guarantee math.OC · 2026-06-09 · unverdicted · none · ref 36
Presents a stochastic gradient algorithm for non-separable optimization with local convergence guarantees under smoothness assumptions.
Anchor PCA stat.ML · 2026-06-04 · unverdicted · none · ref 35
Anchor PCA recovers a maximal invariant subspace for multi-domain data via PCA on a modified target matrix that trades off explained variance with domain agreement.
Optimal experimental design for passive imaging source problems math.OC · 2026-06-03 · unverdicted · none · ref 33
A two-level low-rank approximation enables scalable A-optimal sensor design for passive imaging without repeated PDE solves in the online phase.
Optimal sequential two-stage Bayes Factor Design for two-arm clinical Phase II Trials with binary Endpoints stat.ME · 2026-06-01 · unverdicted · none · ref 47
Derives exact operating characteristic corrections and a numerical search over sample sizes to obtain optimal two-stage Bayes factor designs for two-arm binary-endpoint phase II trials that minimize expected sample size under the null.
The Nonparametric Kiefer-Weiss Problem math.ST · 2026-05-29 · unverdicted · none · ref 5
The nonparametric Kiefer-Weiss problem is solved by deriving an optimal stopping policy based on a two-dimensional statistic (likelihood ratio plus expected remaining sample size) whose randomization rule maps the likelihood ratio to an integer sample size.
A General Recipe for Parameter-Free Nonconvex Optimization via Higher-Order Regularization math.OC · 2026-05-29 · unverdicted · none · ref 58
A general framework for parameter-free smooth nonconvex optimization via higher-order regularization yields algorithms with optimal complexity bounds without prior parameter knowledge.
Faster Monotone Implied Volatility Solver q-fin.CP · 2026-05-21 · unverdicted · none · ref 7 · 2 links
ThiopheneIV is a monotone implied-volatility solver using Choi-Huh-Su seed, Euler-Chebyshev and Halley iterations, proven to converge monotonically in exact arithmetic, with double-precision boundary handling and comparisons to Jäckel's solver.
Hyper-V2X: Hypernetworks for Estimating Epistemic and Aleatoric Uncertainty in Cooperative Bird's-Eye-View Semantic Segmentation cs.CV · 2026-05-20 · unverdicted · none · ref 35
Hyper-V2X uses a Bayesian hypernetwork with partial weight generation and V2X context embedding to produce calibrated epistemic and aleatoric uncertainty estimates for multi-agent BEV segmentation on the OPV2V benchmark.
SMA-DP: Spectral Memory-Aware Differential Privacy for Deep Learning cs.LG · 2026-05-19 · unverdicted · none · ref 16
SMA-DP-SGD augments DP-SGD with a spectral memory-aware fractional branch from prior privatized updates to improve accuracy on CIFAR and MNIST while preserving conditional differential privacy.
When Outcome Looks Right But Discipline Fails: Trace-Based Evaluation Under Hidden Competitor State cs.AI · 2026-05-18 · unverdicted · none · ref 7
The paper introduces discipline stability, a trace-based evaluation paradigm for checking if RL agents maintain behavioral discipline like rule-based competitors in hidden-state competitive settings such as hotel pricing and bidding.
Accelerating charging dynamics of electric double-layer capacitors cond-mat.soft · 2026-05-18 · unverdicted · none · ref 43
Derives time-dependent voltage protocols that eliminate an arbitrary number of relaxation modes to accelerate charging and discharging of planar EDLCs in finite time shorter than intrinsic relaxation timescales.
Revisiting the Adam-SGD Gap in LLM Pre-Training: The Role of Large Effective Learning Rates cs.LG · 2026-05-18 · unverdicted · none · ref 8
The Adam-SGD gap in large-batch LLM pre-training arises mainly from SGD's restricted effective learning rates caused by small gradients and output-layer spikes; clipping lets SGD recover nearly all of Adam's performance.
Analogical Trajectory Transfer cs.CV · 2026-05-14 · conditional · none · ref 46
A method transfers trajectories across 3D scenes by clustering objects, predicting hierarchical smooth maps from foundation model features, assembling them combinatorially, and refining for coherence.
Scale selection for geometric medians on product manifolds math.ST · 2026-05-08 · unverdicted · none · ref 62
Joint location-scale minimization for geometric medians on product manifolds degenerates to marginal medians, and three new scale-selection methods restore identifiability with asymptotic guarantees.
The Endogeneity of Miscalibration: Impossibility and Escape in Scored Reporting cs.GT · 2026-05-08 · unverdicted · none · ref 12
Non-affine approval functions create unavoidable miscalibration in proper scoring rules for strategic agents, but step-function thresholds enable first-best screening without it, uniquely for the Brier score.
