GAIA introduces a geometry-adaptive integral autoencoder that unifies forward, boundary-value, and inverse PDE operator learning on arbitrary domains via geometry tokens and cross-attention.
Adam: A Method for Stochastic Optimization
Mixed citation behavior. Most common role is method (50%).
abstract
We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.
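The abstract describes the method only in words. As a rough illustration, the sketch below implements the commonly cited form of the Adam update (exponential moving averages of the gradient and the squared gradient, with bias correction) in plain Python. The update rule and the default values for `beta1`, `beta2`, and `eps` are taken from the published algorithm as it is generally known, not from this page. The function name `adam_minimize` and the toy quadratic objective are hypothetical and exist only for this example.

```python
import math


def adam_minimize(grad, x0, steps=3000, lr=0.01,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a function given its gradient, using Adam-style updates.

    `grad` maps a list of floats to a list of partial derivatives.
    This is an illustrative sketch, not a tuned implementation.
    """
    x = list(x0)
    m = [0.0] * len(x)  # first-moment estimate (moving average of gradients)
    v = [0.0] * len(x)  # second-moment estimate (moving average of squared gradients)
    for t in range(1, steps + 1):
        g = grad(x)
        for i, gi in enumerate(g):
            m[i] = beta1 * m[i] + (1.0 - beta1) * gi
            v[i] = beta2 * v[i] + (1.0 - beta2) * gi * gi
            # Bias correction compensates for the zero initialization of m and v.
            m_hat = m[i] / (1.0 - beta1 ** t)
            v_hat = v[i] / (1.0 - beta2 ** t)
            x[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


def toy_grad(x):
    # Gradient of the hypothetical objective f(x) = (x0 - 3)^2 + 10 * (x1 + 1)^2.
    return [2.0 * (x[0] - 3.0), 20.0 * (x[1] + 1.0)]


if __name__ == "__main__":
    print(adam_minimize(toy_grad, [0.0, 0.0]))
```

On the very first step the bias-corrected ratio reduces to roughly `g / |g|`, so every coordinate moves by about `lr` regardless of how large its gradient is. That is one way to see the invariance to diagonal rescaling of the gradients that the abstract mentions.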
representative citing papers
ShardNet enforces non-convex polyhedral safety constraints in neural controllers by construction via a differentiable projection layer, achieving 100% verified safety and over 3x larger safe sets than prior methods on double integrator benchmarks.
The paper establishes the first finite-time convergence rate of 1/T^{2/13} for classical Adam (with bias correction, no extra steps) in nonsmooth nonconvex optimization under heavy-tailed noise with β1=β2.
Machine learning discovers a tube-seeding strategy for IBP reduction of Feynman integrals that scales linearly with numerator power, demonstrated on rank-20 2-loop 5-point integrals.
TAKO demonstrates real-time adversarial takeover of robotic diffusion policies via reusable universal patches on visual inputs, achieving 100% success in steering attacker-chosen trajectories across multiple tasks, encoders, and diffusion methods.
Forward gradient framework for PQCs unifies SPSA and parameter-shift as limits, introduces QUIVER adaptive optimizer with closed-form measurement allocation, and demonstrates efficient training of 60-qubit circuits on ECG5000 and MNIST.
Introduces REST-ASMR multimodal dataset of PPG, stimuli, and continuous annotations for ASMR research, validated with 97% responder rate, significant agreement, PPG deceleration, and BiLSTM achieving 75.51% frame-level accuracy under strict subject-video independent 4-fold CV.
PINNs are used to non-parametrically infer the neutron star EOS from NICER and pulsar data, producing M_max = 2.06 M_sun, R_1.4 = 12.85 km, and a reproducible speed-of-sound softening at 2-4 rho_0 consistent with quark-hadron crossover.
OpenVMR uses normalizing flow to detect out-of-distribution queries and performs moment retrieval only on in-distribution queries.
Derives geodesic ridge regularization and Riemannian Gibbs Process prior for feature-learning wide neural networks, generalizing kernel-regime results via function-space axiomatization.
Ensembits is the first tokenizer of protein conformational ensembles that outperforms static tokenizers on RMSF prediction and matches them on function and mutation tasks while using less pretraining data.
In the high-dimensional limit the spherical Boltzmann machine admits exact equations for training dynamics, Bayesian evidence, and cascades of phase transitions tied to mode alignment with data, which connect to generative phenomena including double descent and out-of-equilibrium biases.
Attention and LoRA regression losses induce Poincaré inequalities under mild regularization, so SGD-mimicking SDEs converge to minimizers with no assumptions on data or model size.
SLayerGen generates crystals invariant to any space or layer group via autoregressive lattice and Wyckoff sampling plus equivariant diffusion, achieving gains over bulk models on diperiodic materials after correcting a prior loss inconsistency for hexagonal groups.
3DSS is the first differentiable surface splatting renderer that recovers shape, spatially-varying BRDF materials, and HDR illumination from multi-view images via a coverage-based compositing model derived from reconstruction kernels.
PF-AGD is the first parameter-free deterministic accelerated first-order method with Õ(ε^{-5/3} log(1/ε)) complexity for smooth non-convex optimization.
STARE uses step-wise RL to attack multimodal models, achieving 68% higher attack success rate while revealing that adversarial optimization concentrates conceptual toxicity early and detail toxicity late in the generation trajectory.
Qvine uses vine copula-inspired quantum circuit structures to achieve linear or quadratic depth scaling for loading high-dimensional distributions with high approximation quality.
Neural networks optimized solely on crossing symmetry reconstruct CFT correlators from minimal input data to few-percent accuracy across generalized free fields, minimal models, Ising, N=4 SYM, and AdS diagrams.
MMGait provides a new multi-sensor gait dataset and OmniGait baseline to support single-modal, cross-modal, and unified multi-modal person identification from walking patterns.
Neural simulation-based inference on unbinned top-quark pair data at 13 TeV yields improved gluon PDF precision over traditional binned analyses while incorporating experimental and theoretical uncertainties.
Adam-HNAG is a splitting-based reformulation of Adam that yields the first convergence proof for Adam-type methods, including accelerated rates, in convex smooth optimization.
The paper introduces the CMCC-ReID task, constructs the SYSU-CMCC benchmark dataset, and proposes the PIA network with disentangling and prototype modules that outperforms prior methods on combined modality and clothing variations.
Quantitative Bayesian inference using a deep-learning emulator detects 0.018-0.020 M_sun of helium in the Type Ic supernova 2014L.
citing papers explorer
-
LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning
LIBERO is a new benchmark for lifelong robot learning that evaluates transfer of declarative, procedural, and mixed knowledge across 130 manipulation tasks with provided demonstration data.
-
DragOn: A Benchmark and Dataset for Drag-Based GUI Interactions
DragOn provides a new drag-grounding benchmark and training dataset for GUI agents, with evaluations suggesting potential improvements on computer-use tasks.
-
Subliminal Learning Is Steering Vector Distillation
Subliminal learning is steering vector distillation: a student fine-tuned on a steered teacher's outputs learns to imitate the steering vector.
-
Model-Adaptive Tool Necessity Reveals the Knowing-Doing Gap in LLM Tool Use
Model-adaptive tool necessity shows 26-54% mismatch with actual tool calls across LLMs, driven by nearly orthogonal hidden-state signals for cognition versus action.
-
Sequence Search: Automated Sequence Design using Neural Architecture Search
Sequence Search uses neural architecture search and a differentiable Bloch simulator to automatically create and optimize MRI pulse sequences that satisfy given design goals.
-
Hidden Reliability Risks in Large Language Models: Systematic Identification of Precision-Induced Output Disagreements
PrecisionDiff is a differential testing framework that uncovers widespread precision-induced behavioral disagreements in aligned LLMs, including safety-critical jailbreak divergences across precision formats.
-
Offline Materials Optimization with CliqueFlowmer
CliqueFlowmer combines clique-based model-based optimization with transformer and flow models to generate materials that optimize target properties better than generative baselines.
-
The Norm-Separation Delay Law of Grokking: A First-Principles Theory of Delayed Generalization
Grokking delay follows T_grok - T_mem = Θ(γ_eff^{-1} log(‖θ_mem‖² / ‖θ_post‖²)), derived from norm separation in regularized optimization and validated with high correlations across 293 runs.
-
Small Generalizable Prompt Predictive Models Can Steer Efficient RL Post-Training of Large Reasoning Models
GPS trains a small model on optimization history to predict prompt difficulty and select intermediate-difficulty diverse batches, yielding better training efficiency, final performance, and test-time allocation than baselines on reasoning benchmarks.
-
ID-PaS+ : Identity-Aware Predict-and-Search for General Mixed-Integer Linear Programs
ID-PaS+ introduces an identity-aware predict-and-search framework for general parametric MIPs that outperforms Gurobi and prior PAS methods on real-world large-scale instances.
-
One Step is Enough: Multi-Agent Reinforcement Learning based on One-Step Policy Optimization for Order Dispatch on Ride-Sharing Platforms
OSPO trains optimal order dispatch policies for homogeneous AV fleets using only one-step group rewards, outperforming GRPO on a real ride-hailing dataset.
-
Mastering Diverse Domains through World Models
DreamerV3 uses world models and robustness techniques to solve over 150 tasks across domains with a single configuration, including Minecraft diamond collection from scratch.
-
A Generalist Agent
Gato is a multi-modal, multi-task, multi-embodiment generalist policy using one transformer network to handle text, vision, games, and robotics tasks.
-
Leveraging systems' non-linearity to tackle the scarcity of data in the design of Intelligent Fault Diagnosis Systems
A periodic multi-excitation level procedure leverages system non-linearities to generate images for pre-trained CNNs in vibration-based intelligent fault diagnosis under data scarcity, validated experimentally on a railway pantograph.
-
Denoising Implicit Feedback for Cold-start Recommendation
DIF denoises implicit feedback for cold-start recommendation by inferring aggregated pseudo-labels from content-similar warm items and adaptively correcting noisy labels via relative entropy and cold-start uncertainty.
-
Momentum for Reasoning: Dense Intrinsic Signals in Policy Optimization
ISPO densifies GRPO rewards with sequence-level informativeness and token-level directional signals from policy probabilities to reduce zero-advantage collapse and hallucinated certainty on math benchmarks.
-
What Makes a Desired Graph for Relational Deep Learning?
Schema-derived graphs for relational deep learning suffer from information overload and semantic fragmentation; controlled filtering and injection via an end-to-end optimizer improves accuracy on 26 tasks while often lowering inference cost.
-
Structure-Induced Information for Rerooting Levin Tree Search
Three rerooter designs (clustering-based, heuristic-based, hybrid) for √LTS enable scalable search in complex single-agent environments where explicit subgoal methods fail and achieve SOTA online training efficiency.
-
Let Relations Speak: An End-to-End LLM-GNN Soft Prompt Framework for Fraud Detection
LGSPF uses soft prompts and a parallel GNN encoder to translate multi-relational graph topologies into tokens for LLM-based fraud detection, achieving SOTA on benchmarks.
-
Learning to Reason Efficiently with A* Post-Training
A* post-training lifts 1B-3B Llama-3.2 models from near-zero to above DeepSeek-V3.2 accuracy on deductive reasoning while trading off efficiency via process rewards.
-
Scaling Observation-aware Planning in Uncertain Domains
A POMDP decomposition method scales solving of the Sensor Selection Problem and Positional Observability Problem by 3 and 5 orders of magnitude in instance size and runtime.
-
A Conflict-aware Evidential Framework for Reliable Sleep Stage Classification
ConfSleepNet introduces a conflict-aware evidential aggregation method for multi-modal sleep stage classification using hybrid category structures per modality to produce reliable joint decisions with uncertainty.
-
Virtual Nodes Guided Dynamic Graph Neural Network for Brain Tumor Segmentation with Missing Modalities
A one-stage graph framework with modality-specific virtual nodes and dynamic adjacency adjustment for robust brain tumor segmentation under arbitrary missing MRI modalities, outperforming SOTA on BRATS-2018 and BRATS-2020 incomplete subsets.
-
Learning Developmental Scaffoldings to Guide Self-Organisation
Joint training of NCA rules and SIREN pre-patterns improves robustness, encoding capacity, and symmetry breaking compared to purely self-organizing models by offloading information to initial conditions.
-
Verifiable Process Rewards for Agentic Reasoning
VPR converts symbolic, constraint, or posterior oracles into dense turn-level rewards for RL, improving credit assignment in agentic reasoning and transferring to general benchmarks.
-
Optimizer-Induced Mode Connectivity: From AdamW to Muon
Optimizer choice induces distinct connected regions in the loss landscape of two-layer ReLU networks, with AdamW and Muon sometimes separated by provable barriers.
-
HAGE: Harnessing Agentic Memory via RL-Driven Weighted Graph Evolution
HAGE proposes a trainable weighted graph memory framework with LLM intent classification, dynamic edge modulation, and RL optimization that improves long-horizon reasoning accuracy in agentic LLMs over static baselines.
-
Long-Horizon Q-Learning: Accurate Value Learning via n-Step Inequalities
LQL turns n-step action-sequence lower bounds into a practical hinge-loss stabilizer for off-policy Q-learning without extra networks or forward passes.
-
ZAYA1-8B Technical Report
ZAYA1-8B is a reasoning MoE model with 700M active parameters that matches larger models on math and coding benchmarks and reaches 91.9% on AIME'25 via Markovian RSA test-time compute.
-
Adaptive Dual-Path Framework for Covert Semantic Communication
An adaptive dual-path framework for covert semantic communication achieves near-random attacker detection of 56.12% on Cityscapes while outperforming baselines on primary semantic tasks.
-
GeoDecider: A Coarse-to-Fine Agentic Workflow for Explainable Lithology Classification
GeoDecider introduces a coarse-to-fine agentic workflow using LLMs for explainable lithology classification from well logs, combining a base classifier, tool-augmented reasoning, and geological refinement to outperform baselines on benchmarks.
-
Triple Spectral Fusion for Sensor-based Human Activity Recognition
A triple spectral fusion method using adaptive filtering in three domains improves human activity recognition from inertial sensors on benchmark datasets.
-
Anon: Extrapolating Adaptivity Beyond SGD and Adam
Anon optimizer uses tunable adaptivity and incremental delay update to achieve convergence guarantees and outperform existing methods on image classification, diffusion, and language modeling tasks.
-
ResearchEVO: An End-to-End Framework for Automated Scientific Discovery and Documentation
ResearchEVO automates the discover-then-explain cycle by evolving algorithms via fitness-driven LLM co-evolution and generating grounded, anti-hallucination research papers through sentence-level RAG.
-
TRU: Targeted Reverse Update for Efficient Multimodal Recommendation Unlearning
TRU is a plug-and-play unlearning method for multimodal recommenders that applies ranking fusion, modality scaling, and layer isolation to achieve better retain-forget trade-offs than uniform baselines.
-
BlindGuard: Safeguarding LLM-based Multi-Agent Systems under Unknown Attacks
BlindGuard introduces an unsupervised hierarchical agent encoder plus corruption-guided contrastive detector that identifies malicious agents in LLM-based multi-agent systems without any attack labels or prior knowledge of malicious behaviors.
-
Action-Gradient Monte Carlo Tree Search for Non-Parametric Continuous (PO)MDPs
AGMCTS augments MCTS with action-score gradients for particle beliefs, a Multiple Importance Sampling tree for reuse, and Area Formula gradients for smooth models, outperforming prior sample-based solvers on continuous benchmarks.
-
Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models
Empirical analysis shows scaling inference compute via strategies like tree search can be more efficient than scaling model parameters, with 7B models plus novel search outperforming 34B models.
-
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
GPTFuzz is a black-box fuzzing framework that mutates seed jailbreak templates to automatically generate effective attacks, achieving over 90% success rates on models including ChatGPT and Llama-2.
-
Probabilistic Approximate Logic and its Implementation in the Logical Imagination Engine
PALO is a new approximate probabilistic logic with continuous semantics and SGD/MCMC inference, implemented as LIME in TensorFlow and demonstrated on bioinformatics network synthesis.
-
DeepMind Control Suite
The DeepMind Control Suite supplies a standardized collection of continuous control tasks with interpretable rewards for benchmarking reinforcement learning agents.
-
Sequential Fairness Auditing with Limited Output Access
The paper introduces a sequential generalized likelihood-ratio test framework for auditing Statistical Parity and Equal Opportunity fairness metrics under limited model query access.
-
PMDformer: Patch-Mean Decoupling Information Transformer for Long-term Forecasting
PMDformer uses patch-mean decoupling, trend restoration attention, and proximal variable attention to improve accuracy and stability in long-term time series forecasting benchmarks.
-
Explaining Black-Box Language Models: Learning to Optimize Linguistically-Structured Word Subsets
Amortized optimization with policy gradients and graph knowledge selects informative word subsets to explain black-box DLM outputs.
-
Query-Conditioned Knowledge Alignment for Reliable Cross-System Medical Reasoning
QCEA reformulates entity alignment as a query-conditioned ranking task with semantic encoding, graph learning, and direction-aware transformation to handle context-dependent, asymmetric correspondences in medical knowledge graphs.
-
RADAR: Redundancy-Aware Diffusion for Multi-Agent Communication Structure Generation
RADAR generates query-adaptive multi-agent communication structures via conditional discrete graph diffusion guided by effective graph size, outperforming baselines on accuracy and token consumption across six benchmarks.
-
SCGNN: Semantic Consistency enhanced Graph Neural Network Guided by Granular-ball Computing
SCGNN uses granular-ball computing to partition nodes into groups, builds an anchor-based augmented graph, and fuses predictions with label-consistency supervision to improve semantic consistency in GNNs.
-
Generative Design of a Gas Turbine Combustor Using Invertible Neural Networks
Invertible Neural Networks are used to generate gas turbine combustor designs that meet specified performance criteria from a training database of parameterized designs and simulations.
-
MCPO: Mastery-Consolidated Policy Optimization for Large Reasoning Models
MCPO fixes vanishing training signals and shrinking weights in GRPO by using a hinge-KL regularizer on mastered prompts and prioritizing majority-correct prompts, yielding higher pass@1 and pass@k on math tasks.
-
Safe reinforcement learning with online filtering for fatigue-predictive human-robot task planning and allocation in production
PF-CD3Q uses online particle filtering to estimate fatigue parameters and constrains a deep Q-learning agent to solve fatigue-aware human-robot task planning as a CMDP.