{"total":25,"items":[{"citing_arxiv_id":"2605.23697","ref_index":23,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Noise and Configuration Recovery Impact on Quantum Selected Configuration Interaction","primary_cat":"quant-ph","submitted_at":"2026-05-22T14:50:38+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"Noise in LUCJ sampling for QSCI on N2 expands the configuration space beyond the ideal ansatz and, when paired with recovery, produces more accurate CI energies than noiseless sampling.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.19951","ref_index":10,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Scalable parallel 3-D TEM inversion via rational approximation of the matrix exponential","primary_cat":"math.NA","submitted_at":"2026-05-19T15:05:03+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"A Gauss-Newton-based parallel 3-D TEM inversion method employs rational near-best approximations of the matrix exponential to make time-dependent computations independent of the number of observation times.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.19896","ref_index":52,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Adaptive Reduced-Basis Trust-Region Methods for Defect Identification in Elastic 
Materials","primary_cat":"math.NA","submitted_at":"2026-05-19T14:27:17+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.18210","ref_index":22,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Motion-Enabled Tomography via Gaussian Mixture Models","primary_cat":"math.NA","submitted_at":"2026-05-18T10:56:00+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"A parametric GMM model for motion-enabled tomography that decouples reconstruction into sub-problems and tests on 2D simulations of intersecting trajectories.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.12631","ref_index":66,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Bridging perturbation and variational approaches in brittle fracture","primary_cat":"cond-mat.mtrl-sci","submitted_at":"2026-05-12T18:12:55+00:00","verdict":"CONDITIONAL","verdict_confidence":"MODERATE","novelty_score":7.0,"formal_verification":"none","one_line_summary":"A variational reduced-order model bridges perturbation and variational fracture approaches to simulate coplanar 3D crack propagation in heterogeneous brittle solids, uncovering size-dependent weakening-to-toughening crossovers driven by depinning instabilities.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.09176","ref_index":29,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Navigating LLM Valley: From AdamW to Memory-Efficient and Matrix-Based 
Optimizers","primary_cat":"cs.LG","submitted_at":"2026-05-09T21:34:28+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"This survey organizes LLM optimizer literature into categories and argues the field is shifting toward rigorous, multi-factor comparisons of convergence, memory, stability, and complexity.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"to ordinary data parallelism, which replicates parameters, gradients, and optimizer states on every worker, sharded training distributes some or all of these tensors across workers and reconstructs them only when needed for computation. ZeRO introduced a staged approach to remove redundancy in optimizer states, gradients, and parameters, enabling much larger models while preserving data-parallel training semantics [29]. PyTorch Fully Sharded Data Parallel (FSDP) brings similar ideas into a native PyTorch implementation, sharding parameters, gradients, and optimizer states while integrating with autograd, CUDA memory management, prefetching, and communication-computation overlap [53]. 
In parallel, Megatron-style tensor and pipeline parallelism shard individual layers and sequences across"},{"citing_arxiv_id":"2605.07038","ref_index":46,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Learning Material-Aware Hamiltonian Risk Fields for Safe Navigation","primary_cat":"cs.LG","submitted_at":"2026-05-07T23:33:34+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"A learned context-energy term in port-Hamiltonian policies creates selective risk navigation that activates evasive forces only when safer paths are available.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.06868","ref_index":37,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"When Descent Is Too Stable: Event-Triggered Hamiltonian Learning to Optimize","primary_cat":"cs.LG","submitted_at":"2026-05-07T19:10:48+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"SHAPE lifts gradient descent to an augmented phase space with a learned Hamiltonian vector field and event-triggered port updates to balance descent, exploitation, and exploration, improving best-so-far performance over fixed-policy methods in nonconvex tasks.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"A first-order optimizer may be written abstractly as q_{k+1} = G(q_k, g(q_k); ψ), where g(q_k) is the acquired local force, equal to ∇f(q_k) in the clean-gradient case, and ψ denotes either hand-tuned hyperparameter or learned update-rule parameters. Descent update rule G is proved to be stable and can find the desired global minima if f is convex or strongly convex [37, 42, 26]. Nevertheless, such a form hides a central difficulty of nonconvex optimization under a limited budget: local descent can be too stable. 
Once the update reaches a nearby attractive critical point, the remaining evaluations may be spent refining a basin that is irrelevant to the best solution available under the same budget. This failure mode suggests a navigation view of optimization."},{"citing_arxiv_id":"2605.06392","ref_index":21,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"ADELIA: Automatic Differentiation for Efficient Laplace Inference Approximations","primary_cat":"cs.DC","submitted_at":"2026-05-07T15:07:31+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"ADELIA is the first AD-enabled INLA system that computes exact hyperparameter gradients via a structure-exploiting multi-GPU backward pass, delivering 4.2-7.9x per-gradient speedups and 5-8x better energy efficiency than finite differences on models with up to 1.9 million latent variables.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"sequentially, only one set is live at a time, giving a peak memory footprint of n·b^2 (carries) plus O(b^2) per-block workspace for the reconstructed factors, independent of the number of intermediates that naive AD would store. a) Phase A - Selected inversion gradients for log|Q_c|: The gradient of a log-determinant is ∂log|Q|/∂θ_k = tr(Q^{-1} ∂Q/∂θ_k) [15], [21]. Naively, this requires the full inverse Q^{-1}, which is dense and costs O(N^3) to compute. However, expanding the trace of a product gives tr(AB) = Σ_{ij} A_{ij} B_{ji}, which reduces to an element-wise sum without forming the full product AB. 
Since both Q^{-1} and ∂Q/∂θ_k are symmetric and the latter has the same BTA sparsity pattern as Q, every term with a zero entry in ∂Q"},{"citing_arxiv_id":"2605.06141","ref_index":11,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Matrix-Valued Optimism is Matrix-Valued Augmentation: Additive Hybrid Designs for Constrained Optimization","primary_cat":"cs.LG","submitted_at":"2026-05-07T12:39:22+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Matrix-valued optimism equals matrix-valued augmentation additively for symmetric parameters, enabling closed-form hybrid designs that improve finite-step feasibility in constrained optimization.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.05284","ref_index":28,"ref_count":2,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Direct From Darwin: Deriving Advanced Optimizers From Evolutionary First Principles","primary_cat":"cs.NE","submitted_at":"2026-05-06T17:33:58+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"SGD, approximations of Newton's method, natural gradient descent, and Adam are proven compatible with evolutionary dynamics when augmented with DLS noise, turning them into valid in silico simulations of asexual Darwinian evolution.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"This is just a restatement of part of Theorem 2 and the above discussion. It is instructive to begin by considering the simplest form of anisotropy: diagonal pre-conditioning. A foundational example in the machine learning literature is the RMSProp algorithm (a precursor to Adam). 
RMSProp tracks an exponential moving average of the squared gradients for each parameter, s_{g+1} = β_2 s_g + (1−β_2) f_g^2, where β_2 ∈ [0,1) controls the decay rate and f_g^2 is the element-wise square of the gradient, f_g = ∇log(F_λ)(ϕ_g). Depending on how s_g is initialized (particularly, if s_0 = 0) it may be a biased estimator of the problem's true second moments. This can be corrected by using ŝ_g = s_g/(1−β_2^g) instead. It should also be noted that in our indexing convention, the quantity s_g represents the average second moments of the past gradients, not yet including f_g^2. At each step, the RMSProp algorithm rescales the current gradient, f_g, coordinate-wise by the inverse square root of this average as, V_g f_g = diag( α / ( sqrt( s_{g+1}/(1−β_2^{g+1}) ) + ϵ ) ) f_g. (29) for some base learning rate, α, with some small ϵ included for numerical stability. In effect, this pre-conditioning implements a rescaling of each coordinate so as to make the level sets of the objective function more \"rounded\", making local minima more accessible. Notice, however, that there is a minor issue with implementing RMSProp within the DLS framework, V_g depends upon s_{g+1} and hence on f_g. Recall from above that this is not allowed except at g = 0. The fix, however, is simple: we can instead use V_g f_g = diag( α / ( sqrt( s_g/(1−β_2^{g+1}) ) + ϵ ) ) f_g. (30) Note that we have not reduced the index on the (1−β_2^{g+1}) factor. One must then initialize s_0 to yield the user's specification for V_0, the lineage's initial (diagonal) variance. A very natural choice would be s_0 = (1−β_2) f_0^2 from which it follows that s_0/(1−β_2) = s_1/(1−β_2^2) = f_0^2. 
This initialization for s_0 keeps the pre-conditioner perfectly stable through the early updates wh"},{"citing_arxiv_id":"2605.04735","ref_index":41,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Sequential topology optimization: SIMP initialization for level-set boundary refinement","primary_cat":"cs.CE","submitted_at":"2026-05-06T10:33:17+00:00","verdict":"CONDITIONAL","verdict_confidence":"MODERATE","novelty_score":4.0,"formal_verification":"none","one_line_summary":"A sequential topology optimization approach uses SIMP results to initialize level-set refinement via signed distance function transfer on 3D meshes, achieving comparable compliance with up to 4.6x speedup on benchmarks.","context_count":1,"top_context_role":"background","top_context_polarity":"support","context_text":"Although the level set provides a sharp geometric boundary, the ersatz-material approach approximates the stiffness of cut elements (those intersected by the zero level set) by weighting it with their solid volume fraction [21], producing intermediate-density contributions analogous to those in SIMP but confined to a narrow band around the interface [41]. Applying the same extraction pipeline therefore enables direct comparison across all stages and initialization strategies (SIMP, LS with porous initialization, LS with SIMP initialization) under the same discretization procedure. 4.2. 
Cantilever beam with corner supports The first test case considers a cantilever beam fixed at four corner regions on the 𝑥 = 0 face, each occupying a 0."},{"citing_arxiv_id":"2605.02838","ref_index":65,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"A second-order method landing on the Stiefel manifold via Newton$\\unicode{x2013}$Schulz iteration","primary_cat":"math.OC","submitted_at":"2026-05-04T17:18:11+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"A second-order method achieves local quadratic convergence on the Stiefel manifold without retractions by combining a modified Newton tangent step with Newton-Schulz normal steps for constraint satisfaction.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.15855","ref_index":44,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"PISP: Projected-Space Inference of Stellar Parameters","primary_cat":"astro-ph.SR","submitted_at":"2026-04-17T09:05:18+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"PISP projects high-dimensional spectra into optimized subspaces using PCA or active subspaces plus L1 selection to raise accuracy and speed of stellar parameter inference over standard methods.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.09705","ref_index":66,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Sustainability-Constrained Workload Orchestration for Sovereign AI Infrastructure: A Joint Compute-Network Optimization Framework","primary_cat":"cs.NI","submitted_at":"2026-04-07T18:47:18+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"Introduces the Feasible Sovereign Operating Region 
(FSOR) as a construct for workloads sustainable under physical and regulatory limits, along with a joint compute-network optimization framework that enforces sustainability as hard constraints.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"characterization of both approaches. In penalty-based formulations, constraint violations are incorporated into the objective function as weighted cost terms, effectively converting a constrained problem into an unconstrained one [64, 65]. The general idea is to replace hard constraints by penalties and then exploit the well-developed machinery for unconstrained optimization [66]. As the penalty coefficient tends to infinity, the penalized solution converges to a solution of the original constrained problem under standard regularity conditions [65]. This approach has computational advantages - it does not restrict the feasible region and can be effective when the constraint surface is non-convex or disconnected - but it carries a fundamental semantic"},{"citing_arxiv_id":"2604.05735","ref_index":42,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Does the total energy difference method for modelling core level photoemission fail for bigger molecules?","primary_cat":"physics.chem-ph","submitted_at":"2026-04-07T11:36:14+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"New gas-phase measurements of C 1s binding energies in anthrone agree with ΔSCF calculations, and a benchmark of 44 core levels in molecules with 10-40 atoms yields a mean absolute error of 0.19 eV.","context_count":1,"top_context_role":"method","top_context_polarity":"use_method","context_text":"electrons in the presence of the core hole. The core-hole-adapted basis sets used in this work are given in the supplementary information. 
The GW calculation of the valence level spectrum of anthrone was also performed in FHI-aims. Quasiparticle energy levels of the valence states were obtained using a \"one-shot\" G0W0 calculation based on a PBE0 [42] starting point. In the implementation used in this work [43], the self-energy is first calculated on the imaginary frequency axis, and then analytically continued onto the real axis. For the analytic continuation, a Pade approximant with 16 parameters was employed. Quadruple-zeta numerical atom-centered basis sets with valence-correlation consistency from"},{"citing_arxiv_id":"2511.18998","ref_index":38,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"A trust-region funnel algorithm for gray-box optimization","primary_cat":"math.OC","submitted_at":"2025-11-24T11:23:48+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"A trust-region funnel algorithm for gray-box optimization achieves global convergence to first-order critical points and performs comparably or better than the classical trust-region filter method.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2511.03296","ref_index":2,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Technical results on the convergence of quasi-Newton methods for nonsmooth optimization","primary_cat":"math.OC","submitted_at":"2025-11-05T09:00:58+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Sufficient conditions on eigenvalue vanishing in quasi-Newton updates, observed numerically, are shown to imply convergence to criticality for piecewise differentiable nonsmooth functions, along with the method's ability to explore piecewise 
structure.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2509.01651","ref_index":66,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Trust-region filter algorithms utilizing Hessian information for gray-box optimization","primary_cat":"math.OC","submitted_at":"2025-09-01T17:53:06+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Four Hessian-informed trust-region filter variants using low- and high-fidelity surrogates reduce iterations and black-box evaluations by up to an order of magnitude on 25 benchmarks and five engineering cases while lowering tuning sensitivity.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2506.09211","ref_index":101,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"An Introduction to Solving the Least-Squares Problem in Variational Data Assimilation","primary_cat":"math.NA","submitted_at":"2025-06-10T20:02:51+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":2.0,"formal_verification":"none","one_line_summary":"This is an introductory review of the linear algebraic subproblems and contemporary solvers in variational data assimilation for geophysical applications.","context_count":1,"top_context_role":"background","top_context_polarity":"support","context_text":"LMPs are defined by (5.2) P = [I − Z(Z^T A Z)^{-1} Z^T A][I − A Z(Z^T A Z)^{-1} Z^T] + θ Z(Z^T A Z)^{-1} Z^T, where Z is a matrix with ℓ linearly independent columns, and θ > 0 is a scaling parameter that is often set to 1 [52, 64]. Note that if Z spans the entire finite-dimensional space and θ = 1 then P = A^{-1}. This family of LMPs is inspired by the BFGS method [101] for minimizing a nonlinear cost function by gradually approximating the inverse of the Hessian. 
Special cases of the LMP arise for particular choices of the columns of Z. Let the eigenpairs of A be (z_i, λ_i) with the z_i orthonormal and λ_1 ≥ λ_2 ≥ ... ≥ λ_ℓ > 1. If Z = [z_1, ..., z_ℓ] then the so-called spectral LMP [75] or deflating preconditioner [55, 64] is"},{"citing_arxiv_id":"2504.10435","ref_index":47,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"What metric to optimize for suppressing instability in a Vlasov-Poisson system?","primary_cat":"math.NA","submitted_at":"2025-04-14T17:26:09+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Comparison of objective functions for stabilizing the Vlasov-Poisson system shows that time-integrated metrics produce more convex optimization landscapes favorable to gradient-based methods.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2412.20425","ref_index":24,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"An Efficient Stochastic Subgradient Method for the Global Placement Problem in Very Large-Scale Integration Circuits","primary_cat":"math.OC","submitted_at":"2024-12-29T10:21:26+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"A ReLU-penalty formulation for VLSI global placement is solved via stochastic subgradient descent, with the first claimed convergence proof for ReLU-type nonsmooth nonconvex problems.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2411.09764","ref_index":17,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"ModelPredictiveControl.jl: advanced process control made easy in 
Julia","primary_cat":"eess.SY","submitted_at":"2024-11-14T19:21:24+00:00","verdict":"ACCEPT","verdict_confidence":"MODERATE","novelty_score":7.0,"formal_verification":"none","one_line_summary":"The paper presents ModelPredictiveControl.jl, an open-source Julia toolkit for model predictive control including nonlinear, economic, and successive linearization variants, illustrated with CSTR and inverted pendulum simulations and benchmarked against MATLAB.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2407.11703","ref_index":46,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Numerical Eigenvalue Optimization by Shape-Variations for Maxwell's Eigenvalue Problem","primary_cat":"math.OC","submitted_at":"2024-07-16T13:21:54+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Shape optimization of Maxwell eigenvalues via adjoint sensitivities on a reference domain, solved with a damped inverse BFGS method and mixed finite elements.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"1907.10121","ref_index":120,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"SciPy 1.0--Fundamental Algorithms for Scientific Computing in Python","primary_cat":"cs.MS","submitted_at":"2019-07-23T20:31:36+00:00","verdict":"ACCEPT","verdict_confidence":"MODERATE","novelty_score":2.0,"formal_verification":"none","one_line_summary":"SciPy 1.0 documents a mature open-source library that has become the de facto standard for scientific algorithms in Python with broad adoption across research projects.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}