{"total":14,"items":[{"citing_arxiv_id":"2605.31559","ref_index":6,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Functional Attention: From Pairwise Affinities to Functional Correspondences","primary_cat":"cs.LG","submitted_at":"2026-05-29T17:22:19+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Functional Attention replaces pairwise softmax attention with structured linear operators inspired by geometric functional maps to produce compact, resolution-invariant representations for operator learning.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.17848","ref_index":15,"ref_count":2,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Learning Empirical Evidence Equilibria under Weak Environmental Coupling","primary_cat":"cs.GT","submitted_at":"2026-05-18T04:42:30+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Decentralized Q-learning agents reach an Empirical Evidence Equilibrium in weakly coupled dynamic environments.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.16809","ref_index":54,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Informative Graph Structure Learning","primary_cat":"cs.LG","submitted_at":"2026-05-16T04:46:59+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"InGSL reduces edge redundancy in existing graph structure learning methods by adding a mutual-information-guided diversity term, delivering better results with fewer edges across six tested 
frameworks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.15651","ref_index":4,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Sharp Spectral Thresholds for Logit Fixed Points","primary_cat":"cs.LG","submitted_at":"2026-05-15T06:11:20+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"For finite-dimensional affine logit systems the sharp dimension-free stability threshold is β‖ΠWΠ‖_{T→T}<2, extending the certified regime beyond classical conservative bounds.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.09813","ref_index":57,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Optimizing Server Placement for Vertical Federated Learning in Dynamic Edge/Fog Networks","primary_cat":"cs.NI","submitted_at":"2026-05-10T23:30:18+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SC-DN establishes a global first-order stationary point per round and solves a mixed-integer signomial program to optimize four control variables for VFL, yielding better classification performance and lower resource use than greedy baselines on image and multi-modal data.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"energies, and (iii) server movement energy. In particular, (a) models the VFL convergence via the bound for first-order stationary points in Theorem 1 multiplied by a relative importance term, γ r n. 
The result in Theorem 1 captures the ML model training hyperparameter heterogeneity as well as the progression of local ML model training within the Lipschitz-smooth coefficients [55], [56], [57], [58] and we abstract the relationships between data feature quality/quantity and ML architecture by introducing a relative importance factor, γ r n > 1 ∀ n ∈ N r, r ∈ R to ensure that the upper bound for the first-order stationary point is maintained. To estimate γ r, we develop Algorithm 2, which re-assesses devices' impact to overall performance of VFL via iteratively testing exclusion"},{"citing_arxiv_id":"2605.08689","ref_index":43,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Structure-Centric Graph Foundation Model via Geometric Bases","primary_cat":"cs.LG","submitted_at":"2026-05-09T04:56:21+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"SCGFM creates transferable graph representations by aligning heterogeneous topologies to shared learnable geometric bases via Gromov-Wasserstein distances and re-encoding features accordingly.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.20551","ref_index":112,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"On Bayesian Softmax-Gated Mixture-of-Experts Models","primary_cat":"stat.ML","submitted_at":"2026-04-22T13:37:00+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Bayesian softmax-gated mixture-of-experts models achieve posterior contraction for density estimation and parameter recovery using Voronoi losses, plus two strategies for choosing the number of 
experts.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.20276","ref_index":48,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Rethinking Intrinsic Dimension Estimation in Neural Representations","primary_cat":"cs.LG","submitted_at":"2026-04-22T07:24:53+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Common ID estimators fail to track the true intrinsic dimension of neural representations and are instead driven by other factors.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.14381","ref_index":51,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Learning Cut Distributions with Quantum Optimization","primary_cat":"quant-ph","submitted_at":"2026-04-15T19:55:14+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"QAOA ansatz with finite layers can capture any bitstring distribution and solves the Fair Cut Cover problem with provable and empirical advantages over classical approximations on certain graphs.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2512.04745","ref_index":19,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Neural Policy Composition from Free Energy Minimization","primary_cat":"math.OC","submitted_at":"2025-12-04T12:31:41+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Policy composition emerges from variational free energy minimization through a convergent gradient flow with a soft-competitive recurrent neural 
implementation.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2510.16132","ref_index":55,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"A Minimal-Assumption Analysis of Q-Learning with Time-Varying Policies","primary_cat":"cs.LG","submitted_at":"2025-10-17T18:19:00+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Establishes last-iterate convergence rates for on-policy Q-learning under minimal irreducibility assumptions, with sample complexity O(1/ξ²) matching off-policy up to exploration factors.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2505.19525","ref_index":10,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Rethinking Gating Mechanism in Sparse MoE: Handling Arbitrary Modality Inputs with Confidence-Guided Gate","primary_cat":"cs.LG","submitted_at":"2025-05-26T05:18:55+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"ConfSMoE adds expert-opinion imputation and detaches softmax routing scores to ground-truth task confidence to relieve expert collapse in SMoE without extra load-balance losses, evaluated on four real-world datasets.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2404.14442","ref_index":33,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Toward a Unified Lyapunov-Certified ODE Convergence Analysis of Smooth Q-Learning with p-Norms","primary_cat":"cs.LG","submitted_at":"2024-04-20T01:16:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Unified ODE convergence analysis for smooth Q-learning variants via p-norm Lyapunov functions, valid even when 
the Bellman operator is not a contraction.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"1906.11148","ref_index":49,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Chaining Meets Chain Rule: Multilevel Entropic Regularization and Training of Neural Nets","primary_cat":"cs.LG","submitted_at":"2019-06-26T15:05:24+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Derives algorithm-dependent generalization bounds for neural nets using multilevel entropic regularization and proposes a Metropolis-simulated multi-scale Gibbs training procedure tested on a two-layer net for MNIST.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}