mega hub Mixed citations

Adam: A Method for Stochastic Optimization

Diederik P. Kingma, Jimmy Ba · 2014 · cs.LG · arXiv 1412.6980

Mixed citation behavior. Most common role is method (50%).

2073 Pith papers citing it

Method 50% of classified citations

abstract

We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.
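As a rough illustration of the "adaptive estimates of lower-order moments" the abstract describes, the update can be sketched in plain Python. This is a minimal sketch, not the authors' reference code: the function names `adam_step` and `minimize_quadratic` and the toy objective are hypothetical, and the default hyper-parameters (step size 0.001, β1 = 0.9, β2 = 0.999, ε = 1e-8) are the commonly cited defaults, assumed here rather than taken from this page.

```python
import math


def adam_step(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on lists of floats; returns new (params, m, v).

    t is the 1-based step counter used for bias correction.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g        # running estimate of the gradient mean
        v_i = beta2 * v_i + (1 - beta2) * g * g    # running estimate of the squared gradient
        m_hat = m_i / (1 - beta1 ** t)             # correct the bias from zero initialization
        v_hat = v_i / (1 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


def minimize_quadratic(steps=2000, lr=0.05):
    """Hypothetical demo: minimize f(x) = (x - 3)^2 starting from x = 0."""
    x, m, v = [0.0], [0.0], [0.0]
    for t in range(1, steps + 1):
        grad = [2.0 * (x[0] - 3.0)]
        x, m, v = adam_step(x, grad, m, v, t, lr=lr)
    return x[0]
```

Because each step divides the first-moment estimate by the square root of the second-moment estimate, multiplying a gradient coordinate by a constant leaves the step nearly unchanged (up to ε). That is the rescaling invariance the abstract mentions: the very first step has magnitude close to `lr` whatever the gradient scale.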

citation-role summary

method 117 · background 97 · other 9 · baseline 8 · dataset 2

citation-polarity summary

use method 117 · background 86 · unclear 20 · baseline 8 · use dataset 2
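The "Method 50% of classified citations" figure can be checked against the two summaries above. The sketch below assumes the percentage is taken over the sum of the listed counts; the page does not state the denominator, and the helper name `shares` is hypothetical.

```python
# Counts copied from the citation-role and citation-polarity summaries.
role_counts = {"method": 117, "background": 97, "other": 9, "baseline": 8, "dataset": 2}
polarity_counts = {"use method": 117, "background": 86, "unclear": 20,
                   "baseline": 8, "use dataset": 2}


def shares(counts):
    """Return each label's share of the total, in percent, rounded to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


role_total = sum(role_counts.values())          # classified citations by role
polarity_total = sum(polarity_counts.values())  # classified citations by polarity
method_share = shares(role_counts)["method"]    # share of the "method" role
```

Under that assumption both summaries total 233 classified citations, and the method role accounts for 117 of them, which rounds to the 50% shown on the page.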

authors

Diederik P. Kingma, Jimmy Ba

counterfactual ablation

If this work disappeared, these are the nearest dependency candidates in Pith, weighted toward method, dataset, baseline, and extension contexts where available. This is a structural signal, not a retraction verdict.

representative citing papers

GAIA: Geometry-Adaptive Operator Learning for Forward and Inverse Problems

cs.LG · 2026-07-01 · conditional · novelty 8.0

GAIA introduces a geometry-adaptive integral autoencoder that unifies forward, boundary-value, and inverse PDE operator learning on arbitrary domains via geometry tokens and cross-attention.

ShardNet: Training Neural Controllers with Hard, Non-Convex Constraints

eess.SY · 2026-06-29 · unverdicted · novelty 8.0

ShardNet enforces non-convex polyhedral safety constraints in neural controllers by construction via a differentiable projection layer, achieving 100% verified safety and over 3x larger safe sets than prior methods on double integrator benchmarks.

Adam Converges in Nonsmooth Nonconvex Optimization

math.OC · 2026-06-21 · unverdicted · novelty 8.0

The paper establishes the first finite-time convergence rate of 1/T^{2/13} for classical Adam (with bias correction, no extra steps) in nonsmooth nonconvex optimization under heavy-tailed noise with β1=β2.

Efficient AI-Inspired Reduction of Feynman Integrals via Tube Seeding

hep-ph · 2026-06-09 · unverdicted · novelty 8.0

Machine learning discovers a tube-seeding strategy for IBP reduction of Feynman integrals that scales linearly with numerator power, demonstrated on rank-20 2-loop 5-point integrals.

Test-time Adversarial Takeover: A Real-time Hijacking Interface against Robotic Diffusion Policies

cs.RO · 2026-06-09 · unverdicted · novelty 8.0

TAKO demonstrates real-time adversarial takeover of robotic diffusion policies via reusable universal patches on visual inputs, achieving 100% success in steering attacker-chosen trajectories across multiple tasks, encoders, and diffusion methods.

Adaptive directional gradients for parameterised quantum circuits

quant-ph · 2026-06-08 · unverdicted · novelty 8.0

Forward gradient framework for PQCs unifies SPSA and parameter-shift as limits, introduces QUIVER adaptive optimizer with closed-form measurement allocation, and demonstrates efficient training of 60-qubit circuits on ECG5000 and MNIST.

A multimodal dataset of photoplethysmography and continuous behavioral responses to ASMR and nature videos

cs.LG · 2026-05-30 · unverdicted · novelty 8.0

Introduces REST-ASMR multimodal dataset of PPG, stimuli, and continuous annotations for ASMR research, validated with 97% responder rate, significant agreement, PPG deceleration, and BiLSTM achieving 75.51% frame-level accuracy under strict subject-video independent 4-fold CV.

Neutron Star Equation of State via Physics Informed Neural Network

astro-ph.HE · 2026-05-29 · unverdicted · novelty 8.0

PINNs are used to non-parametrically infer the neutron star EOS from NICER and pulsar data, producing M_max = 2.06 M_sun, R_1.4 = 12.85 km, and a reproducible speed-of-sound softening at 2-4 rho_0 consistent with quark-hadron crossover.

Not All Inputs Are Valid: Towards Open-Set Video Moment Retrieval Using Language

cs.CV · 2026-05-28 · unverdicted · novelty 8.0

OpenVMR uses normalizing flow to detect out-of-distribution queries and performs moment retrieval only on in-distribution queries.

Canonical Regularisation of Wide Feature-Learning Neural Networks

stat.ML · 2026-05-18 · unverdicted · novelty 8.0

Derives geodesic ridge regularization and Riemannian Gibbs Process prior for feature-learning wide neural networks, generalizing kernel-regime results via function-space axiomatization.

ENSEMBITS: an alphabet of protein conformational ensembles

cs.LG · 2026-05-13 · unverdicted · novelty 8.0 · 2 refs

Ensembits is the first tokenizer of protein conformational ensembles that outperforms static tokenizers on RMSF prediction and matches them on function and mutation tasks while using less pretraining data.

Spherical Boltzmann machines: a solvable theory of learning and generation in energy-based models

cs.LG · 2026-05-09 · unverdicted · novelty 8.0

In the high-dimensional limit the spherical Boltzmann machine admits exact equations for training dynamics, Bayesian evidence, and cascades of phase transitions tied to mode alignment with data, which connect to generative phenomena including double descent and out-of-equilibrium biases.

Convergent Stochastic Training of Attention and Understanding LoRA

cs.LG · 2026-05-08 · unverdicted · novelty 8.0

Attention and LoRA regression losses induce Poincaré inequalities under mild regularization, so SGD-mimicking SDEs converge to minimizers with no assumptions on data or model size.

SLayerGen: a Crystal Generative Model for all Space and Layer Groups

cond-mat.mtrl-sci · 2026-05-07 · unverdicted · novelty 8.0

SLayerGen generates crystals invariant to any space or layer group via autoregressive lattice and Wyckoff sampling plus equivariant diffusion, achieving gains over bulk models on diperiodic materials after correcting a prior loss inconsistency for hexagonal groups.

3DSS: 3D Surface Splatting for Inverse Rendering

cs.GR · 2026-05-07 · unverdicted · novelty 8.0 · 3 refs

3DSS is the first differentiable surface splatting renderer that recovers shape, spatially-varying BRDF materials, and HDR illumination from multi-view images via a coverage-based compositing model derived from reconstruction kernels.

A Parameter-Free First-Order Algorithm for Non-Convex Optimization with $\tilde{O}(\epsilon^{-5/3})$ Global Rate

math.OC · 2026-05-04 · conditional · novelty 8.0

PF-AGD is the first parameter-free deterministic accelerated first-order method with Õ(ε^{-5/3} log(1/ε)) complexity for smooth non-convex optimization.

STARE: Step-wise Temporal Alignment and Red-teaming Engine for Multi-modal Toxicity Attack

cs.CR · 2026-05-01 · unverdicted · novelty 8.0

STARE uses step-wise RL to attack multimodal models, achieving 68% higher attack success rate while revealing that adversarial optimization concentrates conceptual toxicity early and detail toxicity late in the generation trajectory.

Qvine: Vine Structured Quantum Circuits for Loading High Dimensional Distributions

quant-ph · 2026-04-29 · unverdicted · novelty 8.0

Qvine uses vine copula-inspired quantum circuit structures to achieve linear or quadratic depth scaling for loading high-dimensional distributions with high approximation quality.

Neural Spectral Bias and Conformal Correlators I: Introduction and Applications

hep-th · 2026-04-20 · unverdicted · novelty 8.0

Neural networks optimized solely on crossing symmetry reconstruct CFT correlators from minimal input data to few-percent accuracy across generalized free fields, minimal models, Ising, N=4 SYM, and AdS diagrams.

MMGait: Towards Multi-Modal Gait Recognition

cs.CV · 2026-04-17 · conditional · novelty 8.0

MMGait provides a new multi-sensor gait dataset and OmniGait baseline to support single-modal, cross-modal, and unified multi-modal person identification from walking patterns.

Proton Structure from Neural Simulation-Based Inference at the LHC

hep-ph · 2026-04-14 · unverdicted · novelty 8.0

Neural simulation-based inference on unbinned top-quark pair data at 13 TeV yields improved gluon PDF precision over traditional binned analyses while incorporating experimental and theoretical uncertainties.

Adam-HNAG: A Convergent Reformulation of Adam with Accelerated Rate

math.OC · 2026-04-09 · unverdicted · novelty 8.0

Adam-HNAG is a splitting-based reformulation of Adam that yields the first convergence proof for Adam-type methods, including accelerated rates, in convex smooth optimization.

CMCC-ReID: Cross-Modality Clothing-Change Person Re-Identification

cs.CV · 2026-04-03 · unverdicted · novelty 8.0

The paper introduces the CMCC-ReID task, constructs the SYSU-CMCC benchmark dataset, and proposes the PIA network with disentangling and prototype modules that outperforms prior methods on combined modality and clothing variations.

Traces of Helium Detected in Type Ic Supernova 2014L

astro-ph.HE · 2026-03-31 · accept · novelty 8.0

Quantitative Bayesian inference using a deep-learning emulator detects 0.018-0.020 M_sun of helium in the Type Ic supernova 2014L.

citing papers explorer

Showing 50 of 67 citing papers after filters.

LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning cs.AI · 2023-06-05 · conditional · none · ref 32 · internal anchor
LIBERO is a new benchmark for lifelong robot learning that evaluates transfer of declarative, procedural, and mixed knowledge across 130 manipulation tasks with provided demonstration data.
DragOn: A Benchmark and Dataset for Drag-Based GUI Interactions cs.AI · 2026-06-04 · unverdicted · none · ref 23 · internal anchor
DragOn provides a new drag-grounding benchmark and training dataset for GUI agents, with evaluations suggesting potential improvements on computer-use tasks.
Subliminal Learning Is Steering Vector Distillation cs.AI · 2026-05-31 · unverdicted · none · ref 19 · internal anchor
Subliminal learning is steering vector distillation: a student fine-tuned on a steered teacher's outputs learns to imitate the steering vector.
Model-Adaptive Tool Necessity Reveals the Knowing-Doing Gap in LLM Tool Use cs.AI · 2026-05-13 · unverdicted · none · ref 12 · 2 links · internal anchor
Model-adaptive tool necessity shows 26-54% mismatch with actual tool calls across LLMs, driven by nearly orthogonal hidden-state signals for cognition versus action.
Sequence Search: Automated Sequence Design using Neural Architecture Search cs.AI · 2026-04-16 · unverdicted · none · ref 22 · internal anchor
Sequence Search uses neural architecture search and a differentiable Bloch simulator to automatically create and optimize MRI pulse sequences that satisfy given design goals.
Hidden Reliability Risks in Large Language Models: Systematic Identification of Precision-Induced Output Disagreements cs.AI · 2026-04-02 · unverdicted · none · ref 23 · internal anchor
PrecisionDiff is a differential testing framework that uncovers widespread precision-induced behavioral disagreements in aligned LLMs, including safety-critical jailbreak divergences across precision formats.
Offline Materials Optimization with CliqueFlowmer cs.AI · 2026-03-06 · unverdicted · none · ref 9 · internal anchor
CliqueFlowmer combines clique-based model-based optimization with transformer and flow models to generate materials that optimize target properties better than generative baselines.
The Norm-Separation Delay Law of Grokking: A First-Principles Theory of Delayed Generalization cs.AI · 2026-03-05 · conditional · none · ref 6 · internal anchor
Grokking delay follows T_grok - T_mem = Θ(γ_eff^{-1} log(‖θ_mem‖² / ‖θ_post‖²)), derived from norm separation in regularized optimization and validated with high correlations across 293 runs.
Small Generalizable Prompt Predictive Models Can Steer Efficient RL Post-Training of Large Reasoning Models cs.AI · 2026-02-02 · unverdicted · none · ref 20 · internal anchor
GPS trains a small model on optimization history to predict prompt difficulty and select intermediate-difficulty diverse batches, yielding better training efficiency, final performance, and test-time allocation than baselines on reasoning benchmarks.
ID-PaS+ : Identity-Aware Predict-and-Search for General Mixed-Integer Linear Programs cs.AI · 2025-12-11 · unverdicted · none · ref 24 · internal anchor
ID-PaS+ introduces an identity-aware predict-and-search framework for general parametric MIPs that outperforms Gurobi and prior PAS methods on real-world large-scale instances.
One Step is Enough: Multi-Agent Reinforcement Learning based on One-Step Policy Optimization for Order Dispatch on Ride-Sharing Platforms cs.AI · 2025-07-21 · conditional · none · ref 15 · internal anchor
OSPO trains optimal order dispatch policies for homogeneous AV fleets using only one-step group rewards, outperforming GRPO on a real ride-hailing dataset.
Mastering Diverse Domains through World Models cs.AI · 2023-01-10 · unverdicted · none · ref 61 · internal anchor
DreamerV3 uses world models and robustness techniques to solve over 150 tasks across domains with a single configuration, including Minecraft diamond collection from scratch.
A Generalist Agent cs.AI · 2022-05-12 · accept · none · ref 36 · internal anchor
Gato is a multi-modal, multi-task, multi-embodiment generalist policy using one transformer network to handle text, vision, games, and robotics tasks.
Leveraging systems' non-linearity to tackle the scarcity of data in the design of Intelligent Fault Diagnosis Systems cs.AI · 2026-06-18 · unverdicted · none · ref 20 · internal anchor
A periodic multi-excitation level procedure leverages system non-linearities to generate images for pre-trained CNNs in vibration-based intelligent fault diagnosis under data scarcity, validated experimentally on a railway pantograph.
Denoising Implicit Feedback for Cold-start Recommendation cs.AI · 2026-06-17 · unverdicted · none · ref 31 · internal anchor
DIF denoises implicit feedback for cold-start recommendation by inferring aggregated pseudo-labels from content-similar warm items and adaptively correcting noisy labels via relative entropy and cold-start uncertainty.
Momentum for Reasoning: Dense Intrinsic Signals in Policy Optimization cs.AI · 2026-06-07 · unverdicted · none · ref 41 · internal anchor
ISPO densifies GRPO rewards with sequence-level informativeness and token-level directional signals from policy probabilities to reduce zero-advantage collapse and hallucinated certainty on math benchmarks.
What Makes a Desired Graph for Relational Deep Learning? cs.AI · 2026-06-07 · unverdicted · none · ref 8 · internal anchor
Schema-derived graphs for relational deep learning suffer from information overload and semantic fragmentation; controlled filtering and injection via an end-to-end optimizer improves accuracy on 26 tasks while often lowering inference cost.
Structure-Induced Information for Rerooting Levin Tree Search cs.AI · 2026-05-28 · unverdicted · none · ref 4 · internal anchor
Three rerooter designs (clustering-based, heuristic-based, hybrid) for √LTS enable scalable search in complex single-agent environments where explicit subgoal methods fail and achieve SOTA online training efficiency.
Let Relations Speak: An End-to-End LLM-GNN Soft Prompt Framework for Fraud Detection cs.AI · 2026-05-27 · unverdicted · none · ref 1 · internal anchor
LGSPF uses soft prompts and a parallel GNN encoder to translate multi-relational graph topologies into tokens for LLM-based fraud detection, achieving SOTA on benchmarks.
Learning to Reason Efficiently with A* Post-Training cs.AI · 2026-05-23 · unverdicted · none · ref 3 · internal anchor
A* post-training lifts 1B-3B Llama-3.2 models from near-zero to above DeepSeek-V3.2 accuracy on deductive reasoning while trading off efficiency via process rewards.
Scaling Observation-aware Planning in Uncertain Domains cs.AI · 2026-05-21 · unverdicted · none · ref 18 · internal anchor
A POMDP decomposition method scales solving of the Sensor Selection Problem and Positional Observability Problem by 3 and 5 orders of magnitude in instance size and runtime.
A Conflict-aware Evidential Framework for Reliable Sleep Stage Classification cs.AI · 2026-05-16 · unverdicted · none · ref 49 · internal anchor
ConfSleepNet introduces a conflict-aware evidential aggregation method for multi-modal sleep stage classification using hybrid category structures per modality to produce reliable joint decisions with uncertainty.
Virtual Nodes Guided Dynamic Graph Neural Network for Brain Tumor Segmentation with Missing Modalities cs.AI · 2026-05-16 · unverdicted · none · ref 17 · internal anchor
A one-stage graph framework with modality-specific virtual nodes and dynamic adjacency adjustment for robust brain tumor segmentation under arbitrary missing MRI modalities, outperforming SOTA on BRATS-2018 and BRATS-2020 incomplete subsets.
Learning Developmental Scaffoldings to Guide Self-Organisation cs.AI · 2026-05-14 · unverdicted · none · ref 13 · internal anchor
Joint training of NCA rules and SIREN pre-patterns improves robustness, encoding capacity, and symmetry breaking compared to purely self-organizing models by offloading information to initial conditions.
Verifiable Process Rewards for Agentic Reasoning cs.AI · 2026-05-11 · unverdicted · none · ref 10 · 2 links · internal anchor
VPR converts symbolic, constraint, or posterior oracles into dense turn-level rewards for RL, improving credit assignment in agentic reasoning and transferring to general benchmarks.
Optimizer-Induced Mode Connectivity: From AdamW to Muon cs.AI · 2026-05-11 · unverdicted · none · ref 155 · internal anchor
Optimizer choice induces distinct connected regions in the loss landscape of two-layer ReLU networks, with AdamW and Muon sometimes separated by provable barriers.
HAGE: Harnessing Agentic Memory via RL-Driven Weighted Graph Evolution cs.AI · 2026-05-11 · unverdicted · none · ref 70 · internal anchor
HAGE proposes a trainable weighted graph memory framework with LLM intent classification, dynamic edge modulation, and RL optimization that improves long-horizon reasoning accuracy in agentic LLMs over static baselines.
Long-Horizon Q-Learning: Accurate Value Learning via n-Step Inequalities cs.AI · 2026-05-07 · unverdicted · none · ref 22 · 2 links · internal anchor
LQL turns n-step action-sequence lower bounds into a practical hinge-loss stabilizer for off-policy Q-learning without extra networks or forward passes.
ZAYA1-8B Technical Report cs.AI · 2026-05-06 · unverdicted · none · ref 61 · internal anchor
ZAYA1-8B is a reasoning MoE model with 700M active parameters that matches larger models on math and coding benchmarks and reaches 91.9% on AIME'25 via Markovian RSA test-time compute.
Adaptive Dual-Path Framework for Covert Semantic Communication cs.AI · 2026-05-05 · unverdicted · none · ref 48 · internal anchor
An adaptive dual-path framework for covert semantic communication achieves near-random attacker detection of 56.12% on Cityscapes while outperforming baselines on primary semantic tasks.
GeoDecider: A Coarse-to-Fine Agentic Workflow for Explainable Lithology Classification cs.AI · 2026-05-05 · unverdicted · none · ref 29 · internal anchor
GeoDecider introduces a coarse-to-fine agentic workflow using LLMs for explainable lithology classification from well logs, combining a base classifier, tool-augmented reasoning, and geological refinement to outperform baselines on benchmarks.
Triple Spectral Fusion for Sensor-based Human Activity Recognition cs.AI · 2026-05-04 · unverdicted · none · ref 68 · internal anchor
A triple spectral fusion method using adaptive filtering in three domains improves human activity recognition from inertial sensors on benchmark datasets.
Anon: Extrapolating Adaptivity Beyond SGD and Adam cs.AI · 2026-05-04 · unverdicted · none · ref 6 · internal anchor
Anon optimizer uses tunable adaptivity and incremental delay update to achieve convergence guarantees and outperform existing methods on image classification, diffusion, and language modeling tasks.
ResearchEVO: An End-to-End Framework for Automated Scientific Discovery and Documentation cs.AI · 2026-04-07 · unverdicted · none · ref 13 · internal anchor
ResearchEVO automates the discover-then-explain cycle by evolving algorithms via fitness-driven LLM co-evolution and generating grounded, anti-hallucination research papers through sentence-level RAG.
TRU: Targeted Reverse Update for Efficient Multimodal Recommendation Unlearning cs.AI · 2026-04-02 · unverdicted · none · ref 16 · internal anchor
TRU is a plug-and-play unlearning method for multimodal recommenders that applies ranking fusion, modality scaling, and layer isolation to achieve better retain-forget trade-offs than uniform baselines.
BlindGuard: Safeguarding LLM-based Multi-Agent Systems under Unknown Attacks cs.AI · 2025-08-11 · unverdicted · none · ref 8 · internal anchor
BlindGuard introduces an unsupervised hierarchical agent encoder plus corruption-guided contrastive detector that identifies malicious agents in LLM-based multi-agent systems without any attack labels or prior knowledge of malicious behaviors.
Action-Gradient Monte Carlo Tree Search for Non-Parametric Continuous (PO)MDPs cs.AI · 2025-03-15 · unverdicted · none · ref 27 · internal anchor
AGMCTS augments MCTS with action-score gradients for particle beliefs, a Multiple Importance Sampling tree for reuse, and Area Formula gradients for smooth models, outperforming prior sample-based solvers on continuous benchmarks.
Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models cs.AI · 2024-08-01 · conditional · none · ref 17 · internal anchor
Empirical analysis shows scaling inference compute via strategies like tree search can be more efficient than scaling model parameters, with 7B models plus novel search outperforming 34B models.
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts cs.AI · 2023-09-19 · unverdicted · none · ref 29 · internal anchor
GPTFuzz is a black-box fuzzing framework that mutates seed jailbreak templates to automatically generate effective attacks, achieving over 90% success rates on models including ChatGPT and Llama-2.
Probabilistic Approximate Logic and its Implementation in the Logical Imagination Engine cs.AI · 2019-07-25 · unverdicted · none · ref 17 · internal anchor
PALO is a new approximate probabilistic logic with continuous semantics and SGD/MCMC inference, implemented as LIME in TensorFlow and demonstrated on bioinformatics network synthesis.
DeepMind Control Suite cs.AI · 2018-01-02 · accept · none · ref 6 · internal anchor
The DeepMind Control Suite supplies a standardized collection of continuous control tasks with interpretable rewards for benchmarking reinforcement learning agents.
Sequential Fairness Auditing with Limited Output Access cs.AI · 2026-06-29 · unverdicted · none · ref 11 · internal anchor
The paper introduces a sequential generalized likelihood-ratio test framework for auditing Statistical Parity and Equal Opportunity fairness metrics under limited model query access.
PMDformer: Patch-Mean Decoupling Information Transformer for Long-term Forecasting cs.AI · 2026-06-25 · unverdicted · none · ref 9 · internal anchor
PMDformer uses patch-mean decoupling, trend restoration attention, and proximal variable attention to improve accuracy and stability in long-term time series forecasting benchmarks.
Explaining Black-Box Language Models: Learning to Optimize Linguistically-Structured Word Subsets cs.AI · 2026-06-07 · unverdicted · none · ref 28 · internal anchor
Amortized optimization with policy gradients and graph knowledge selects informative word subsets to explain black-box DLM outputs.
Query-Conditioned Knowledge Alignment for Reliable Cross-System Medical Reasoning cs.AI · 2026-05-18 · conditional · none · ref 14 · internal anchor
QCEA reformulates entity alignment as a query-conditioned ranking task with semantic encoding, graph learning, and direction-aware transformation to handle context-dependent, asymmetric correspondences in medical knowledge graphs.
RADAR: Redundancy-Aware Diffusion for Multi-Agent Communication Structure Generation cs.AI · 2026-05-11 · unverdicted · none · ref 35 · internal anchor
RADAR generates query-adaptive multi-agent communication structures via conditional discrete graph diffusion guided by effective graph size, outperforming baselines on accuracy and token consumption across six benchmarks.
SCGNN: Semantic Consistency enhanced Graph Neural Network Guided by Granular-ball Computing cs.AI · 2026-05-04 · unverdicted · none · ref 6 · internal anchor
SCGNN uses granular-ball computing to partition nodes into groups, builds an anchor-based augmented graph, and fuses predictions with label-consistency supervision to improve semantic consistency in GNNs.
Generative Design of a Gas Turbine Combustor Using Invertible Neural Networks cs.AI · 2026-04-27 · unverdicted · none · ref 30 · internal anchor
Invertible Neural Networks are used to generate gas turbine combustor designs that meet specified performance criteria from a training database of parameterized designs and simulations.
MCPO: Mastery-Consolidated Policy Optimization for Large Reasoning Models cs.AI · 2026-04-18 · unverdicted · none · ref 36 · internal anchor
MCPO fixes vanishing training signals and shrinking weights in GRPO by using a hinge-KL regularizer on mastered prompts and prioritizing majority-correct prompts, yielding higher pass@1 and pass@k on math tasks.
Safe reinforcement learning with online filtering for fatigue-predictive human-robot task planning and allocation in production cs.AI · 2026-04-14 · unverdicted · none · ref 40 · internal anchor
PF-CD3Q uses online particle filtering to estimate fatigue parameters and constrains a deep Q-learning agent to solve fatigue-aware human-robot task planning as a CMDP.