Title resolution pending

Machine Learning: An Artificial Intelligence Approach, Vol · 1983

29 Pith papers cite this work. Polarity classification is still indexing.

29 Pith papers citing it

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Many Needles in a Haystack: Active Hit Discovery for Perturbation Experiments

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

Probability-of-Hit acquisition function ranks perturbation candidates by posterior probability of threshold exceedance, with asymptotic optimality proof and up to 6.4% gains on real immunology data.

Fix the Loss, Not the Radius: Rethinking the Adversarial Perturbation of Sharpness-Aware Minimization

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

LE-SAM inverts SAM by fixing the loss budget instead of the parameter-space radius, yielding better generalization across benchmarks.

ResRL: Boosting LLM Reasoning via Negative Sample Projection Residual Reinforcement Learning

cs.LG · 2026-05-01 · unverdicted · novelty 7.0

ResRL decouples shared semantics between positive and negative responses in LLM reinforcement learning via SVD-based projection residuals, outperforming baselines including NSR by up to 9.4% on math reasoning benchmarks.

Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs

cs.CL · 2024-12-30 · unverdicted · novelty 7.0

o1-like models overthink easy tasks; self-training reduces compute use without accuracy loss on GSM8K, MATH500, GPQA, and AIME.

Fast Transformer Decoding: One Write-Head is All You Need

cs.NE · 2019-11-06 · unverdicted · novelty 7.0

Multi-query attention shares keys and values across heads in Transformers, greatly reducing memory bandwidth for faster decoding with only minor quality loss.

Frequency Bias and OOD Generalization in Neural Operators under a Variable-Coefficient Wave Equation

cs.LG · 2026-05-13 · unverdicted · novelty 6.0

FNO exhibits strong frequency bias with sharp OOD error growth on high-frequency inputs in wave equations, while DeepONet shows milder degradation despite higher baseline error.

TokenRatio: Principled Token-Level Preference Optimization via Ratio Matching

cs.CL · 2026-05-12 · unverdicted · novelty 6.0

TBPO derives a token-level preference optimization objective from sequence-level pairwise data via Bregman divergence ratio matching that generalizes DPO and improves alignment quality.

CTFusion: A CTF-based Benchmark for LLM Agent Evaluation

cs.LG · 2026-05-12 · unverdicted · novelty 6.0

CTFusion is a live-CTF streaming benchmark that prevents data contamination by forwarding only the first correct flag per challenge under a shared team account.

Learning Graph Foundation Models on Riemannian Graph-of-Graphs

cs.LG · 2026-05-11 · unverdicted · novelty 6.0

R-GFM constructs multi-scale Riemannian graph-of-graphs to learn geometry-adaptive representations, reducing structural domain generalization error and delivering up to 49% relative gains on downstream graph tasks.

The general regularisation scheme applied to conditional density estimation

math.ST · 2026-05-10 · unverdicted · novelty 6.0

The general regularization scheme is extended to conditional density estimation, yielding a new estimator with proven convergence rates that matches or beats the Nadaraya-Watson estimator in experiments.

Modeling Implicit Conflict Monitoring Mechanisms against Stereotypes in LLMs

cs.SI · 2026-05-10 · unverdicted · novelty 6.0

LLMs contain identifiable COCO neurons that enable implicit self-correction against stereotypes; targeted editing of these neurons improves fairness and robustness to jailbreaks while preserving generation quality.

Minimal Filling Architectures of Polynomial Neural Networks: Counterexamples, Frontier Search, and Defects

cs.LG · 2026-05-10 · unverdicted · novelty 6.0

A counterexample disproves the conjecture that minimal filling architectures of polynomial neural networks always have unimodal hidden layer widths.

QueST: Persistent Queries as Semantic Monitors for Drift Suppression in Long-Horizon Tracking

cs.CV · 2026-05-10 · unverdicted · novelty 6.0

QueST replaces local point tracking with persistent semantic queries that globally attend to spatio-temporal features and apply 3D grounding to suppress drift, cutting absolute point error by 67.7% versus TAP-Net on long articulated sequences.

From Passive Reuse to Active Reasoning: Grounding Large Language Models for Neuro-Symbolic Experience Replay

cs.AI · 2026-05-10 · unverdicted · novelty 6.0

NSER uses zero-shot LLMs to induce behavioral rules from RL trajectories, grounds them in differentiable first-order logic, and applies the symbolic structures to dynamically reweight experience replay for better sample efficiency.

Adversary-Robust Learning from Fully Asynchronous Directional Derivative Estimates

cs.LG · 2026-05-10 · unverdicted · novelty 6.0

FAR-SIGN achieves adversary-resilient fully asynchronous optimization via signed directional projections and two-timescale correction, with almost-sure convergence to stationary points at rates O(n^{-1/4+ε}) first-order and O(n^{-1/6+ε}) zeroth-order.

Kinematics-Driven Gaussian Shape Deformation for Blurry Monocular Dynamic Scenes

cs.CV · 2026-05-09 · unverdicted · novelty 6.0

Kinematics-GS reparameterizes Gaussian shapes along motion trajectories with a kinematic prior to reconstruct dynamic 3D scenes from blurry monocular videos by separating dynamic and static components and using coarse-to-fine optimization.

MARLaaS: Multi-Tenant Asynchronous Reinforcement Learning as a Service

cs.DC · 2026-05-08 · unverdicted · novelty 6.0

MARLaaS enables concurrent RL fine-tuning across up to 32 tasks using LoRA adapters and a disaggregated asynchronous architecture, matching single-task performance while improving accelerator utilization by 4.3x and cutting end-to-end time by 85%.

Where's the Plan? Locating Latent Planning in Language Models with Lightweight Mechanistic Interventions

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

Future-rhyme information is linearly decodable at line boundaries across model families and strengthens with scale, yet only Gemma-3-27B causally depends on it, with the driver migrating to the boundary around layer 30 and localizing to five attention heads.

Accelerating Langevin Monte Carlo via Efficient Stochastic Runge--Kutta Methods beyond Log-Concavity

math.ST · 2026-05-08 · unverdicted · novelty 6.0

A Hessian-free stochastic Runge-Kutta LMC algorithm achieves strong order 1.5 with two gradient evaluations per step and uniform-in-time convergence O(d^{3/2} h^{3/2}) in non-log-concave settings.

NPMixer: Hierarchical Neighboring Patch Mixing for Time Series Forecasting

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

NPMixer improves multivariate time series forecasting accuracy by combining a data-adaptive wavelet decomposition with hierarchical neighboring patch mixing via MLPs and channel mixing on high-frequency components.

ExecuTorch -- A Unified PyTorch Solution to Run AI Models On-Device

cs.LG · 2026-05-05 · unverdicted · novelty 6.0

ExecuTorch is a unified PyTorch-native deployment framework that enables seamless on-device execution of AI models across heterogeneous hardware while preserving original PyTorch semantics.

Behavioral Mode Discovery for Fine-tuning Multimodal Generative Policies

cs.LG · 2026-05-12 · unverdicted · novelty 5.0

Unsupervised behavioral mode discovery combined with mutual information rewards enables RL fine-tuning of multimodal generative policies that achieves higher success rates without losing action diversity.

SynerMedGen: Synergizing Medical Multimodal Understanding with Generation via Task Alignment

cs.CV · 2026-05-09 · unverdicted · novelty 5.0

SynerMedGen introduces generation-aligned understanding tasks and a two-stage training strategy that enables strong zero-shot medical image synthesis performance and outperforms specialized models when generation training is added.

Insider Attacks in Multi-Agent LLM Consensus Systems

cs.MA · 2026-05-08 · unverdicted · novelty 5.0

A malicious agent in multi-agent LLM consensus systems can be trained via a surrogate world model and RL to reduce consensus rates and prolong disagreement more effectively than direct prompt attacks.

citing papers explorer

Showing 29 of 29 citing papers.

Many Needles in a Haystack: Active Hit Discovery for Perturbation Experiments cs.LG · 2026-05-11 · unverdicted · none · ref 4
Probability-of-Hit acquisition function ranks perturbation candidates by posterior probability of threshold exceedance, with asymptotic optimality proof and up to 6.4% gains on real immunology data.
Fix the Loss, Not the Radius: Rethinking the Adversarial Perturbation of Sharpness-Aware Minimization cs.LG · 2026-05-11 · unverdicted · none · ref 9
LE-SAM inverts SAM by fixing the loss budget instead of the parameter-space radius, yielding better generalization across benchmarks.
ResRL: Boosting LLM Reasoning via Negative Sample Projection Residual Reinforcement Learning cs.LG · 2026-05-01 · unverdicted · none · ref 4
ResRL decouples shared semantics between positive and negative responses in LLM reinforcement learning via SVD-based projection residuals, outperforming baselines including NSR by up to 9.4% on math reasoning benchmarks.
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs cs.CL · 2024-12-30 · unverdicted · none · ref 153
o1-like models overthink easy tasks; self-training reduces compute use without accuracy loss on GSM8K, MATH500, GPQA, and AIME.
Fast Transformer Decoding: One Write-Head is All You Need cs.NE · 2019-11-06 · unverdicted · none · ref 10
Multi-query attention shares keys and values across heads in Transformers, greatly reducing memory bandwidth for faster decoding with only minor quality loss.
Frequency Bias and OOD Generalization in Neural Operators under a Variable-Coefficient Wave Equation cs.LG · 2026-05-13 · unverdicted · none · ref 4
FNO exhibits strong frequency bias with sharp OOD error growth on high-frequency inputs in wave equations, while DeepONet shows milder degradation despite higher baseline error.
TokenRatio: Principled Token-Level Preference Optimization via Ratio Matching cs.CL · 2026-05-12 · unverdicted · none · ref 45
TBPO derives a token-level preference optimization objective from sequence-level pairwise data via Bregman divergence ratio matching that generalizes DPO and improves alignment quality.
CTFusion: A CTF-based Benchmark for LLM Agent Evaluation cs.LG · 2026-05-12 · unverdicted · none · ref 41
CTFusion is a live-CTF streaming benchmark that prevents data contamination by forwarding only the first correct flag per challenge under a shared team account.
Learning Graph Foundation Models on Riemannian Graph-of-Graphs cs.LG · 2026-05-11 · unverdicted · none · ref 12
R-GFM constructs multi-scale Riemannian graph-of-graphs to learn geometry-adaptive representations, reducing structural domain generalization error and delivering up to 49% relative gains on downstream graph tasks.
The general regularisation scheme applied to conditional density estimation math.ST · 2026-05-10 · unverdicted · none · ref 56
The general regularization scheme is extended to conditional density estimation, yielding a new estimator with proven convergence rates that matches or beats the Nadaraya-Watson estimator in experiments.
Modeling Implicit Conflict Monitoring Mechanisms against Stereotypes in LLMs cs.SI · 2026-05-10 · unverdicted · none · ref 84
LLMs contain identifiable COCO neurons that enable implicit self-correction against stereotypes; targeted editing of these neurons improves fairness and robustness to jailbreaks while preserving generation quality.
Minimal Filling Architectures of Polynomial Neural Networks: Counterexamples, Frontier Search, and Defects cs.LG · 2026-05-10 · unverdicted · none · ref 29
A counterexample disproves the conjecture that minimal filling architectures of polynomial neural networks always have unimodal hidden layer widths.
QueST: Persistent Queries as Semantic Monitors for Drift Suppression in Long-Horizon Tracking cs.CV · 2026-05-10 · unverdicted · none · ref 4
QueST replaces local point tracking with persistent semantic queries that globally attend to spatio-temporal features and apply 3D grounding to suppress drift, cutting absolute point error by 67.7% versus TAP-Net on long articulated sequences.
From Passive Reuse to Active Reasoning: Grounding Large Language Models for Neuro-Symbolic Experience Replay cs.AI · 2026-05-10 · unverdicted · none · ref 5
NSER uses zero-shot LLMs to induce behavioral rules from RL trajectories, grounds them in differentiable first-order logic, and applies the symbolic structures to dynamically reweight experience replay for better sample efficiency.
Adversary-Robust Learning from Fully Asynchronous Directional Derivative Estimates cs.LG · 2026-05-10 · unverdicted · none · ref 154
FAR-SIGN achieves adversary-resilient fully asynchronous optimization via signed directional projections and two-timescale correction, with almost-sure convergence to stationary points at rates O(n^{-1/4+ε}) first-order and O(n^{-1/6+ε}) zeroth-order.
Kinematics-Driven Gaussian Shape Deformation for Blurry Monocular Dynamic Scenes cs.CV · 2026-05-09 · unverdicted · none · ref 4
Kinematics-GS reparameterizes Gaussian shapes along motion trajectories with a kinematic prior to reconstruct dynamic 3D scenes from blurry monocular videos by separating dynamic and static components and using coarse-to-fine optimization.
MARLaaS: Multi-Tenant Asynchronous Reinforcement Learning as a Service cs.DC · 2026-05-08 · unverdicted · none · ref 41
MARLaaS enables concurrent RL fine-tuning across up to 32 tasks using LoRA adapters and a disaggregated asynchronous architecture, matching single-task performance while improving accelerator utilization by 4.3x and cutting end-to-end time by 85%.
Where's the Plan? Locating Latent Planning in Language Models with Lightweight Mechanistic Interventions cs.LG · 2026-05-08 · unverdicted · none · ref 32
Future-rhyme information is linearly decodable at line boundaries across model families and strengthens with scale, yet only Gemma-3-27B causally depends on it, with the driver migrating to the boundary around layer 30 and localizing to five attention heads.
Accelerating Langevin Monte Carlo via Efficient Stochastic Runge--Kutta Methods beyond Log-Concavity math.ST · 2026-05-08 · unverdicted · none · ref 4
A Hessian-free stochastic Runge-Kutta LMC algorithm achieves strong order 1.5 with two gradient evaluations per step and uniform-in-time convergence O(d^{3/2} h^{3/2}) in non-log-concave settings.
NPMixer: Hierarchical Neighboring Patch Mixing for Time Series Forecasting cs.LG · 2026-05-08 · unverdicted · none · ref 4
NPMixer improves multivariate time series forecasting accuracy by combining a data-adaptive wavelet decomposition with hierarchical neighboring patch mixing via MLPs and channel mixing on high-frequency components.
ExecuTorch -- A Unified PyTorch Solution to Run AI Models On-Device cs.LG · 2026-05-05 · unverdicted · none · ref 4
ExecuTorch is a unified PyTorch-native deployment framework that enables seamless on-device execution of AI models across heterogeneous hardware while preserving original PyTorch semantics.
Behavioral Mode Discovery for Fine-tuning Multimodal Generative Policies cs.LG · 2026-05-12 · unverdicted · none · ref 4
Unsupervised behavioral mode discovery combined with mutual information rewards enables RL fine-tuning of multimodal generative policies that achieves higher success rates without losing action diversity.
SynerMedGen: Synergizing Medical Multimodal Understanding with Generation via Task Alignment cs.CV · 2026-05-09 · unverdicted · none · ref 4
SynerMedGen introduces generation-aligned understanding tasks and a two-stage training strategy that enables strong zero-shot medical image synthesis performance and outperforms specialized models when generation training is added.
Insider Attacks in Multi-Agent LLM Consensus Systems cs.MA · 2026-05-08 · unverdicted · none · ref 4
A malicious agent in multi-agent LLM consensus systems can be trained via a surrogate world model and RL to reduce consensus rates and prolong disagreement more effectively than direct prompt attacks.
InfoGeo: Information-Theoretic Object-Centric Learning for Cross-View Generalizable UAV Geo-Localization cs.CV · 2026-05-08 · unverdicted · none · ref 4
InfoGeo reformulates cross-view geo-localization as an information bottleneck that aligns object-centric structural relations across views while minimizing view-specific noise.
Structured Diffusion Bridges: Inductive Bias for Denoising Diffusion Bridges cs.LG · 2026-05-03 · unverdicted · none · ref 4
A structured diffusion bridge method achieves near fully-paired modality translation quality using alignment constraints even in unpaired or semi-paired regimes.
Self-Captioning Multimodal Interaction Tuning: Amplifying Exploitable Redundancies for Robust Vision Language Models cs.CV · 2026-05-03 · unverdicted · none · ref 4
A self-captioning method using a Multimodal Interaction Gate amplifies redundant interactions to reduce visual-induced errors by 38.3% and improve consistency by 16.8% in vision-language models.
Revitalizing the Beginning: Avoiding Storage Dependency for Model Merging in Continual Learning cs.LG · 2026-05-08 · unverdicted · none · ref 4
The paper proposes Trajectory Regularized Merging (TRM) to enable storage-free model merging in continual learning by optimizing in an augmented trajectory subspace with task alignment, prediction consistency, and gradient responsiveness objectives, claiming SOTA results.
OUI as a Structural Observable: Towards an Activation-Centric View of Neural Network Training cs.LG · 2026-05-12 · unverdicted · none · ref 4
OUI provides an activation-based observable that anticipates training regimes across supervised learning, reinforcement learning, and control tasks before convergence occurs.

Title resolution pending

fields

years

verdicts

representative citing papers

citing papers explorer