Many Needles in a Haystack: Active Hit Discovery for Perturbation Experiments
The Probability-of-Hit acquisition function ranks perturbation candidates by their posterior probability of exceeding a hit threshold, backed by an asymptotic optimality proof and gains of up to 6.4% on real immunology data.
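Under a Gaussian posterior over each candidate's effect size (an illustrative assumption; the paper's posterior model may differ), the probability-of-hit score and the ranking it induces can be sketched as:

```python
import math

def prob_of_hit(mu, sigma, tau):
    # P(effect > tau) when the posterior is N(mu, sigma^2):
    # 1 - Phi((tau - mu) / sigma), written via the complementary error function.
    return 0.5 * math.erfc((tau - mu) / (sigma * math.sqrt(2.0)))

def rank_candidates(posteriors, tau):
    # posteriors: list of (name, mu, sigma); highest hit probability first.
    return sorted(posteriors, key=lambda p: prob_of_hit(p[1], p[2], tau), reverse=True)
```

Note that for a fixed mean below the threshold, larger posterior uncertainty raises the score, which is what lets acquisition functions of this kind keep exploring uncertain candidates.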
29 Pith papers cite this work.
Citing papers
-
Many Needles in a Haystack: Active Hit Discovery for Perturbation Experiments
Probability-of-Hit acquisition function ranks perturbation candidates by posterior probability of threshold exceedance, with asymptotic optimality proof and up to 6.4% gains on real immunology data.
-
Fix the Loss, Not the Radius: Rethinking the Adversarial Perturbation of Sharpness-Aware Minimization
LE-SAM inverts SAM by fixing the loss budget instead of the parameter-space radius, yielding better generalization across benchmarks.
-
ResRL: Boosting LLM Reasoning via Negative Sample Projection Residual Reinforcement Learning
ResRL decouples shared semantics between positive and negative responses in LLM reinforcement learning via SVD-based projection residuals, outperforming baselines including NSR by up to 9.4% on math reasoning benchmarks.
-
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs
o1-like models overthink easy tasks; self-training reduces compute use without accuracy loss on GSM8K, MATH500, GPQA, and AIME.
-
Fast Transformer Decoding: One Write-Head is All You Need
Multi-query attention shares keys and values across heads in Transformers, greatly reducing memory bandwidth for faster decoding with only minor quality loss.
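The core idea is a single key/value projection shared by all query heads, which shrinks the decoder's K/V cache by a factor of the head count. A minimal numpy sketch (illustrative shapes only; no masking or projection matrices):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_query_attention(q, k, v):
    # q: (heads, seq, d) -- one query projection per head.
    # k, v: (seq, d)     -- a single K/V tensor shared by every head.
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)   # (heads, seq, seq), K broadcast across heads
    return softmax(scores) @ v      # (heads, seq, d), V broadcast across heads
```

During incremental decoding, only the shared `(seq, d)` cache is read per step instead of `(heads, seq, d)`, which is the memory-bandwidth saving the paper targets.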
-
Frequency Bias and OOD Generalization in Neural Operators under a Variable-Coefficient Wave Equation
FNO exhibits strong frequency bias with sharp OOD error growth on high-frequency inputs in wave equations, while DeepONet shows milder degradation despite higher baseline error.
-
TokenRatio: Principled Token-Level Preference Optimization via Ratio Matching
TBPO derives a token-level preference optimization objective from sequence-level pairwise data via Bregman divergence ratio matching that generalizes DPO and improves alignment quality.
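For reference, the sequence-level DPO objective that such token-level methods generalize can be written as follows (a standard formulation, not this paper's objective):

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    # -log sigmoid(beta * [(logp_w - ref_logp_w) - (logp_l - ref_logp_l)]):
    # push the policy to widen the chosen-vs-rejected log-ratio margin
    # relative to the frozen reference model.
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return math.log(1.0 + math.exp(-margin))
```

At zero margin the loss is log 2, and it decays monotonically as the preferred response's relative log-probability grows.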
-
CTFusion: A CTF-based Benchmark for LLM Agent Evaluation
CTFusion is a live-CTF streaming benchmark that prevents data contamination by forwarding only the first correct flag per challenge under a shared team account.
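The deduplication rule is simple to state; a hypothetical sketch (names are illustrative, not CTFusion's actual API):

```python
def first_correct_flags(submissions):
    # submissions: iterable of (challenge, flag, is_correct) in arrival order.
    # Forward only the first correct flag per challenge, so replayed or
    # later-copied solves never reach the shared team account's scoreboard.
    solved = set()
    for challenge, flag, is_correct in submissions:
        if is_correct and challenge not in solved:
            solved.add(challenge)
            yield challenge, flag
```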
-
Learning Graph Foundation Models on Riemannian Graph-of-Graphs
R-GFM constructs multi-scale Riemannian graph-of-graphs to learn geometry-adaptive representations, reducing structural domain generalization error and delivering up to 49% relative gains on downstream graph tasks.
-
The general regularisation scheme applied to conditional density estimation
The general regularization scheme is extended to conditional density estimation, yielding a new estimator with proven convergence rates that matches or beats the Nadaraya-Watson estimator in experiments.
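The Nadaraya-Watson baseline being compared against is the kernel-weighted average of the responses; a minimal sketch with a Gaussian kernel:

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_query, bandwidth=0.5):
    # m(x) = sum_i K((x - x_i)/h) y_i / sum_i K((x - x_i)/h),
    # here with the (unnormalized) Gaussian kernel K(u) = exp(-u^2 / 2).
    u = (np.asarray(x_query)[:, None] - np.asarray(x_train)[None, :]) / bandwidth
    w = np.exp(-0.5 * u**2)
    return (w @ np.asarray(y_train)) / w.sum(axis=1)
```

This estimates only the conditional mean; the paper's estimator targets the full conditional density, of which the mean is one functional.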
-
Modeling Implicit Conflict Monitoring Mechanisms against Stereotypes in LLMs
LLMs contain identifiable COCO neurons that enable implicit self-correction against stereotypes; targeted editing of these neurons improves fairness and robustness to jailbreaks while preserving generation quality.
-
Minimal Filling Architectures of Polynomial Neural Networks: Counterexamples, Frontier Search, and Defects
A counterexample disproves the conjecture that minimal filling architectures of polynomial neural networks always have unimodal hidden layer widths.
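Unimodality of a width sequence (weakly increasing, then weakly decreasing) is mechanically checkable, which is what makes such a counterexample easy to verify; an illustrative checker:

```python
def is_unimodal(widths):
    # True iff the sequence first (weakly) rises and then (weakly) falls.
    i, n = 0, len(widths)
    while i + 1 < n and widths[i] <= widths[i + 1]:
        i += 1                     # climb to the peak
    while i + 1 < n and widths[i] >= widths[i + 1]:
        i += 1                     # descend after the peak
    return i == n - 1              # reached the end iff there was no second rise
```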
-
QueST: Persistent Queries as Semantic Monitors for Drift Suppression in Long-Horizon Tracking
QueST replaces local point tracking with persistent semantic queries that globally attend to spatio-temporal features and apply 3D grounding to suppress drift, cutting absolute point error by 67.7% versus TAP-Net on long articulated sequences.
-
From Passive Reuse to Active Reasoning: Grounding Large Language Models for Neuro-Symbolic Experience Replay
NSER uses zero-shot LLMs to induce behavioral rules from RL trajectories, grounds them in differentiable first-order logic, and applies the symbolic structures to dynamically reweight experience replay for better sample efficiency.
-
Adversary-Robust Learning from Fully Asynchronous Directional Derivative Estimates
FAR-SIGN achieves adversary-resilient fully asynchronous optimization via signed directional projections and two-timescale correction, with almost-sure convergence to stationary points at rates O(n^{-1/4+ε}) first-order and O(n^{-1/6+ε}) zeroth-order.
-
Kinematics-Driven Gaussian Shape Deformation for Blurry Monocular Dynamic Scenes
Kinematics-GS reparameterizes Gaussian shapes along motion trajectories with a kinematic prior to reconstruct dynamic 3D scenes from blurry monocular videos by separating dynamic and static components and using coarse-to-fine optimization.
-
MARLaaS: Multi-Tenant Asynchronous Reinforcement Learning as a Service
MARLaaS enables concurrent RL fine-tuning across up to 32 tasks using LoRA adapters and a disaggregated asynchronous architecture, matching single-task performance while improving accelerator utilization by 4.3x and cutting end-to-end time by 85%.
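The LoRA mechanism behind the per-task adapters keeps the base weight frozen and adds a trainable low-rank update, so each tenant's adapter is small and cheap to swap; a hypothetical numpy sketch:

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16.0):
    # y = x W^T + (alpha / r) * x (B A)^T, with A: (r, d_in), B: (d_out, r).
    # W is frozen; only the low-rank factors A and B are trained per task.
    r = A.shape[0]
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T
```

With B initialized to zero (the usual convention), the adapter starts as an exact no-op on the base model.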
-
Where's the Plan? Locating Latent Planning in Language Models with Lightweight Mechanistic Interventions
Future-rhyme information is linearly decodable at line boundaries across model families and strengthens with scale, yet only Gemma-3-27B causally depends on it, with the driver migrating to the boundary around layer 30 and localizing to five attention heads.
-
Accelerating Langevin Monte Carlo via Efficient Stochastic Runge-Kutta Methods beyond Log-Concavity
A Hessian-free stochastic Runge-Kutta LMC algorithm achieves strong order 1.5 with two gradient evaluations per step and uniform-in-time convergence O(d^{3/2} h^{3/2}) in non-log-concave settings.
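For context, the first-order unadjusted Langevin step that higher-order Runge-Kutta schemes improve upon (a standard baseline, not the paper's method):

```python
import numpy as np

def ula_step(x, grad_U, h, rng):
    # Euler-Maruyama discretization of dX = -grad U(X) dt + sqrt(2) dW:
    # strong order 1 for this additive-noise SDE, versus the order 1.5
    # achieved by the stochastic Runge-Kutta scheme.
    return x - h * grad_U(x) + np.sqrt(2.0 * h) * rng.normal(size=x.shape)
```

Iterating `ula_step` with U(x) = ||x||^2 / 2 (so `grad_U` is the identity) samples approximately from a standard Gaussian, up to an O(h) discretization bias.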
-
NPMixer: Hierarchical Neighboring Patch Mixing for Time Series Forecasting
NPMixer improves multivariate time series forecasting accuracy by combining a data-adaptive wavelet decomposition with hierarchical neighboring patch mixing via MLPs and channel mixing on high-frequency components.
-
ExecuTorch: A Unified PyTorch Solution to Run AI Models On-Device
ExecuTorch is a unified PyTorch-native deployment framework that enables seamless on-device execution of AI models across heterogeneous hardware while preserving original PyTorch semantics.
-
Behavioral Mode Discovery for Fine-tuning Multimodal Generative Policies
Unsupervised behavioral mode discovery combined with mutual information rewards enables RL fine-tuning of multimodal generative policies that achieves higher success rates without losing action diversity.
-
SynerMedGen: Synergizing Medical Multimodal Understanding with Generation via Task Alignment
SynerMedGen introduces generation-aligned understanding tasks and a two-stage training strategy that enables strong zero-shot medical image synthesis performance and outperforms specialized models when generation training is added.
-
Insider Attacks in Multi-Agent LLM Consensus Systems
A malicious agent in multi-agent LLM consensus systems can be trained via a surrogate world model and RL to reduce consensus rates and prolong disagreement more effectively than direct prompt attacks.
-
InfoGeo: Information-Theoretic Object-Centric Learning for Cross-View Generalizable UAV Geo-Localization
InfoGeo reformulates cross-view geo-localization as an information bottleneck that aligns object-centric structural relations across views while minimizing view-specific noise.
-
Structured Diffusion Bridges: Inductive Bias for Denoising Diffusion Bridges
A structured diffusion bridge method achieves near fully-paired modality translation quality using alignment constraints even in unpaired or semi-paired regimes.
-
Self-Captioning Multimodal Interaction Tuning: Amplifying Exploitable Redundancies for Robust Vision Language Models
A self-captioning method using a Multimodal Interaction Gate amplifies redundant interactions to reduce vision-induced errors by 38.3% and improve consistency by 16.8% in vision-language models.
-
Revitalizing the Beginning: Avoiding Storage Dependency for Model Merging in Continual Learning
The paper proposes Trajectory Regularized Merging (TRM) for storage-free model merging in continual learning, optimizing in an augmented trajectory subspace with task-alignment, prediction-consistency, and gradient-responsiveness objectives, and reports state-of-the-art results.
-
OUI as a Structural Observable: Towards an Activation-Centric View of Neural Network Training
OUI provides an activation-based observable that anticipates training regimes across supervised learning, reinforcement learning, and control tasks before convergence occurs.