RR dominates SGD in smooth convex optimization under any reasonable stepsize after any finite number of epochs.
On Information and Sufficiency,
Mixed citation behavior. Most common role is background (62%).
representative citing papers
Presents a deterministic minimax-optimal multicalibration algorithm and its generalization to outcome indistinguishability and omniprediction, resolving open questions on randomization necessity.
Manifold curvature and intrinsic dimension predict layerwise SAE width exponents and asymptotic floors across Gemma models, with cross-model transfer of the geometric regression, establishing a transferable geometric law instead of a universal scaling law.
Steered LLM activations are non-surjective: under practical assumptions, they lie outside the set of states reachable from any discrete prompt.
Direct fixed-weight solver for free-support Wasserstein medians relocates atoms using OT barycentric projections and inverse-distance weights, achieving monotone descent on smoothed objectives with fewer subproblems than nested Weiszfeld baselines.
EFE-based planning is formulated as variational free energy minimization with epistemic priors, decomposing into expected plan costs plus a complexity term.
The paper shows that arbitrage-free information pricing is computationally hard in general, provides a branch-and-bound algorithm, and proves that for threshold utilities arbitrage-freeness reduces to Blackwell dominance, unifying prior query and model pricing results.
EFE-based active inference planning is characterized as VFE on an augmented model plus entropy and planning corrections, with a derived message-passing implementation and grid-world validation.
The paper introduces a protocol-resolved framework for virological measurements, defining an observation operator that maps latent ensembles to observed data and recasting plaque assays as estimates of protocol-conditioned infectious concentration.
Proposes a scale-calibrated median-of-means estimator for robust aggregation of distributed PCA estimates on the product of Euclidean space and Grassmann manifold.
Derives the asymptotic distribution of the spatial Cramér-von Mises independence statistic under β-mixing on R² and implements it in Python with eigenvalue-based critical values.
A 10-qubit convolutional quantum graph neural network fed by autoencoder-compressed jet data achieves performance comparable to classical graph networks in distinguishing boosted Z jets from gluon jets.
A new differentiable reconstruction method uses symmetrized hyperspherical harmonics on quaternions plus two- and three-point descriptors to generate 3D microstructures from 2D data, demonstrated on aluminum alloy with L-BFGS-B optimization.
The test error of random-feature ridge regression with arbitrary data augmentation admits a closed-form asymptotic characterization in the proportional regime that depends only on population covariances and augmentation statistics.
Time-reversed Young interferometry acts as a source-space information processor where mutual information is the reciprocal invariant and source-label entropy can decrease near destructive interference while Fisher information rises.
Three new provably KL-optimal frequency normalization algorithms are presented, one running in linear time in the number of symbols.
The profile maximum likelihood estimator for the location in anisotropic hyperbolic wrapped normal models is strongly consistent, asymptotically normal, and attains the Hájek-Le Cam minimax lower bound under squared geodesic loss.
Exponentially-shifted Gaussian smoothing yields zeroth-order gradient estimators with linear dimension dependence, enabling improved complexity bounds for stochastic optimization including decision-dependent regimes.
VGF solves behavior-regularized RL by transporting particles from a reference distribution to the value-induced optimal policy via discrete value-guided gradient flow.
The normalized sum of negative log-likelihoods under sublinear parsings converges almost surely and in L1 to the entropy rate h_P for any shift-invariant measure on a finite shift space.
Statistical Linkage Learning enables a new mask construction algorithm for Partition Crossover that maintains effectiveness on noisy problems with hidden dependencies and matches noise-free performance when decomposition quality is high.
ManyIH and ManyIH-Bench address instruction conflicts in LLM agents with up to 12 privilege levels across 853 tasks, revealing frontier models achieve only ~40% accuracy.
A meta-learning method identifies the conditional mean of task-specific causal demand parameters by conditioning on all prices while masking two demand outcomes, assuming at least two locally exogenous prices per task.
Inflating the min-norm interpolator by a factor >1 reduces generalization error in linear regression with anisotropic covariances when d/n diverges to infinity.
citing papers explorer
- Sensor Placement for Tsunami Early Warning via Large-Scale Bayesian Optimal Experimental Design
A reformulation of Bayesian OED as dense matrix subset selection plus a pipelined Schur-complement greedy algorithm on hundreds of GPUs enables optimization of 175-sensor networks for billion-degree-of-freedom tsunami models with near-perfect scaling.
- Niching Importance Sampling for Multi-modal Rare-event Simulation
Niching importance sampling yields a robust probability-of-failure estimator that avoids degeneracy on multi-modal performance functions by integrating evolutionary niching with importance sampling.
- QuantumXCT: Learning Interaction-Induced State Transformation in Cell-Cell Communication via Quantum Entanglement and Generative Modeling
QuantumXCT learns parameterized quantum circuits to model interaction-induced unitary transformations between non-interacting and interacting cellular state distributions from transcriptomic profiles.
- Measuring Primitive Accumulation: An Information-Theoretic Approach to Capitalist Enclosure in PIK2, Indonesia
Satellite data projected onto a Marxian simplex shows a 0.405 rad/yr transformation pulse, 38-46 year absorption times into built land, and percolation below random thresholds indicating planned rather than stochastic urban growth in PIK2.
- Weighted Chernoff information and optimal loss exponent in context-sensitive hypothesis testing
The optimal weighted total loss decays as exp(-n times weighted Chernoff information) when the context weight factors across observations.
- Testing Dark Matter with Generative Models for Extragalactic Stellar Streams
X-Stream generates thousands of stream realizations in trial potentials and applies nested sampling to constrain the full radial density profile of dark matter halos from imaging data.
- Conditional Independence of 1D Gibbs States with Applications to Efficient Learning
1D translation-invariant Gibbs states at positive temperature exhibit superexponential decay of Belavkin-Staszewski conditional mutual information, enabling efficient learning from local measurements and tensor network approximations.
- Adaptive nonparametric regression from repeated measurements under common noise
Projection estimator minimizing a covariance-adjusted least-squares contrast for regression with common noise and repeated measurements, with risk analysis and data-driven selection.
- EvoFlock: evolved inverse design of multi-agent motion
Genetic algorithm optimizes parameters of multi-agent flocking models to match user-defined objectives, with alignment emerging from spacing maintenance.
- Scaling Laws for Task-Specific LLM Distillation
Empirical scaling laws for task-specific LLM distillation in quantitative finance indicate that chain-of-thought supervision recovers general knowledge lost during iterative pruning while in-domain performance degrades predictably.
- Principal Covariate Regression with Nuclear Norm Penalty
Proposes PcovRnnp method enabling simultaneous dimension reduction and regularized coefficient estimation via nuclear norm penalty in high-dimensional settings.
- Uncertainty Quantification of Engineering Structures by Polynomial Chaos Expansion and Multivariate Active Learning
A multivariate active learning approach for polynomial chaos expansion selects samples by aggregated output variance to improve surrogate accuracy and stability for vector-valued engineering responses.
- Improving the Efficiency and Effectiveness of LLM Knowledge Distillation for Conversational Search
Combining contrastive loss with KLD distillation and adding sparsity regularization improves effectiveness and reduces FLOPS by 2x in conversational search with minimal recall loss.
- Density Evolution: A Multiscale View of Density Estimation
A review reframing density estimation as 'density evolution' across scales, linking kernel smoothing to heat flow, mixtures to compression, and topology to level sets, while stating three structural results on modes, Gaussian semigroups, and log-concavity.
- A Scalable Parametric Item Calibration Engine (SPICE) for Explanatory IRT with Sparse Data
SPICE is a scalable Bayesian MCMC engine for explanatory IRT calibration on sparsely linked persons and items in large assessment banks.
- PRISMat: Policy-Driven, Permutation-Invariant Autoregressive Material Generation
PRISMat generates crystal slabs with mean absolute errors of 0.188 eV/Å² for cleavage energy and 2.79 eV for work function, reducing error by 4× versus the next best model while using less inference time.
- Irreversibility from Self-Reference: Gradient Flow and an H-Theorem for a Self-Referential Statistical Operator Framework
Proves an H-theorem for monotonic decrease of a convex functional under iteration and gradient flow of a self-referential operator Omega within the local kernel approximation, with perturbative stability of the Tsallis index and numerical confirmation of a re-entrant disordered phase at kappa > 0.5.
- Quantum $f$-divergences via Nussbaum-Szkoła Distributions in Semifinite von Neumann Algebras
Quantum f-divergence equals classical f-divergence of Nussbaum-Szkoła distributions for normal states on semifinite von Neumann algebras.
- Resonance Statistics-Informed Fitting Applied to Automated Cross Section Evaluation
Resonance statistics-informed methods in automated fitting reduce spin group bias, enhance Wigner statistics consistency, and stabilize resonance density with minimal impact on cross section fit quality.
- InsightFlow: LLM-Driven Synthesis of Patient Narratives for Mental Health into Causal Models
LLMs generate 5P causal graphs from 46 psychotherapy intake transcripts that match human expert graphs in structure and meaning, with moderate clinical usefulness ratings.
- Sentiment Classification of Gaza War Headlines: A Comparative Analysis of Large Language Models and Arabic Fine-Tuned BERT Models
LLMs classify Gaza War headlines as strongly negative while fine-tuned Arabic BERT models favor neutral labels, producing measurable non-random divergences in sentiment distributions.
- Stochastic simulation of partial discharge inception
Monte Carlo method simulates electron avalanches with feedback to estimate discharge inception probability and time lag per initial electron position across 2D and 3D electrode geometries.
- Cosmic dipole tensions: confronting the cosmic microwave background with infrared and radio populations of cosmological sources
Bayesian tension analysis shows Planck CMB dipole in >5σ disagreement with CatWISE infrared sources and moderate-to-strong disagreement with radio surveys NVSS and RACS, with evidence for shared astrophysical signals in some catalogs.
- An analysis of nuclear parton distribution function based on relative entropy
A relative-entropy method with a minimum-relative-entropy hypothesis reproduces quark nPDF shapes from global fits and indicates that EPPS21 gluon central values align more closely with the hypothesis than nNNPDF3.0.
- Context-Aware Unit Testing for Quantum Subroutines
Proposes a context-aware unit testing framework for quantum subroutines modeled as parametrized quantum channels, using probabilistic assertions and demonstrated on GHZ preparation and Shor's algorithm subroutines.
- Adaptive Pricing in Insurance: Generalized Linear Models and Gaussian Process Regression Approaches
Adaptive GLM with MQLE and GP regression with UCB for dynamic insurance pricing, showing parameter convergence and regret analysis under delayed claims.
- Implying Volatility: How Fast Can We Go?
FlashIV is a new Black-Scholes implied volatility solver using input normalization, erfcx residual, and fixed Householder refinement that runs faster than Jäckel's Let's Be Rational while staying close to its reference price.
- Software Between Quantum and Machine Learning -- And Down to Pulses
Introduces a JAX-based framework for pulse-level QML with composable ansatze, end-to-end pulse optimization, and Fourier-analytic diagnostics.
- Function, Complexity and Thermodynamics in Adaptive and Intelligent Soft Matter Systems: An Information-Theoretical Formulation
Soft matter systems are modeled as information channels of increasing complexity, yielding a heuristic thermodynamic ceiling on information processing performance and a performance gap to biology attributed to per-element energy scales.
- KiDS+VIKING-450 cosmology with Bayesian hierarchical model redshift distributions
Bayesian hierarchical modeling of photometric redshifts in KiDS+VIKING-450 raises S8 to 0.756 ± 0.039 and reduces Planck tension to 1.9σ.
- Advanced Scientific Methodology Plays Rossini
Computational analysis of Rossini's multiple settings of 'Mi lagnerò tacendo' uses parsing, data mining, and graph theory to explore melodic, harmonic, and textual choices as a foundation for philological research and generative models.
- Can LLMs Emulate Human Belief Dynamics?
LLMs fail to emulate human belief dynamics: they mismatch initial distributions and show higher conformity than humans in network interactions.
- Almost Orthogonal Arrays: Search Three Ways
Authors apply integer programming, meta-heuristics, and algebraic techniques to generate almost orthogonal arrays that outperform prior constructions on several non-orthogonality criteria.
- Farm-wide virtual load monitoring for offshore wind structures via Bayesian neural networks
Bayesian neural networks enable farm-wide virtual load monitoring by predicting structural loads on non-instrumented offshore wind turbines from a fleet-leader's data while quantifying prediction uncertainty.
- A machine-learning-assisted progressive digit-randomness screening framework for detecting non-random patterns in raw numerical research data
FDRS combines digit frequency tests, association metrics, entropy, KL divergence, and ML models to assign risk grades to numerical datasets, showing separation between normal and irregular simulated data with high AUC.
- Bayesian inference for compact binary coalescences with BILBY: Validation and application to the first LIGO-Virgo gravitational-wave transient catalogue
BILBY is validated on simulated compact binary signals and reproduces the eleven GWTC-1 results with configuration and output files provided for reproduction.
- Experiencing Extreme Height for The First Time: The Influence of Height, Self-Judgment of Fear and a Moving Structural Beam on the Heart Rate and Postural Sway During the Quiet Stance
VR study of 12 students shows height increases postural sway, self-judged fear decreases center-of-pressure excursion, and moving beams raise heart rate in simulated steel erection.
- Confidence intervals for the Poisson distribution
Recommends Garwood confidence intervals for summarizing Poisson sampling results after evaluating performance properties of multiple techniques.
- How to quantify direct correlations between variables
- Descent Before Hardness: Orbit-Gap Obstructions in Exact Certification