archive

Every paper Pith has read.

14513 papers in cs.AI · page 2

cs.LG 2026-05-22 reviewed

Reflection symmetry speeds up state-based RL
Reflex: Reinforcement Learning with Reflection Symmetry Exploitation in State-Based Continuous Control

Shuai Zhen +3
cs.AI 2026-05-22 reviewed

Consistency checks raise LLM multi-agent planning success 9.75%
When Planning Fails Despite Correct Execution: On Epistemic Calibration for LLM-Based Multi-Agent Systems

Zehao Wang +3
cs.CV 2026-05-22 reviewed

3D CNNs spot and name hand gestures in live video
Online Hand Gesture Recognition Using 3D Convolutional Neural Networks

Yinghao Qin +1
cs.LG 2026-05-22 reviewed

PPM maps parametric priors into generative forecasts
Parametric Prior Mapping Framework for Non-stationary Probabilistic Time Series Forecasting

Jinglin Li +3
cs.LG 2026-05-22 reviewed

One recursion decomposes every component into paths and token credit
Every Component is a Lookup: Token Attribution and Composition from a Single Decomposition

Po-Kai Chen +2
cs.CL 2026-05-22 reviewed

Metacognitive rewards lift LLM reasoning up to 11 percent
Metacognition as Reward: Reinforcing LLM Reasoning via Knowledge and Regulation Signals

Sirui Chen +8
cs.LG 2026-05-22 reviewed

VAE turns non-Euclidean tasks into measurable space for RL curricula
Curriculum reinforcement learning with measurable task representation learning

Yongyan Wen +5
cs.LG 2026-05-22 reviewed

One-step MeanFlow policy hits SOTA on locomotion tasks
Score-Based One-step MeanFlow Policy Optimization

Kyungyoon Kim +3
cs.DC 2026-05-22 reviewed

Router trims LLM inference tail latency by 52% at wind farms
XWind: A Cross-site Router for Large Language Model Inference Serving at Renewable Energy Farms

Tella Rajashekhar Reddy +9
cs.CV 2026-05-22 reviewed

Uncertainty gate activates contrastive decoding only on risky tokens
CHASD: Language Increment-Calibrated Contrastive Decoding against Hallucination in LVLMs

Xiaoyi Huang +2
cs.RO 2026-05-22 reviewed

Assembling trajectories from primitives reduces error ratio to 1.07
Sparse Compositional Flow Matching by geometric assembly from motion primitives

Yan Tang +4
cs.AI 2026-05-22 reviewed

Contextual bandit adapts ventilator choices to clinician style
Human-in-the-Loop Multi-Agent Ventilator Decision Support with Contextual Bandit Preference Learning

Sijia Li +9
cs.CL 2026-05-22 reviewed

Models converge on representations but diverge on reasoning
Convergence Without Understanding: When Language Models Agree on Representations but Disagree on Reasoning

Muhammad Usama +1
cs.AI 2026-05-22 reviewed

Semantic check enables safe rollback for tool agents
DART: Semantic Recoverability for Structured Tool Agents

Ke Yang +5
cs.AI 2026-05-22 reviewed

OKBs compile AI regulations into executable validation modules
Ontological Knowledge Blocks: Executable Compliance and Profile-Based Validation for Trustworthy AI Systems

Aasish Kumar Sharma +1
cs.AI 2026-05-22 reviewed

Parallel compaction reduces LLM agent wall time with predictable summaries
Parallel Context Compaction for Long-Horizon LLM Agent Serving

Musa Cim +3
cs.LG 2026-05-22 reviewed

Reinforcement learning enforces exact graph assortativity without tuning
Reinforcement Learning for Microcanonical Graph Ensemble with Assortativity Constraints

Hoyun Choi +2
cs.LG 2026-05-22 reviewed

Structural priors fix bad scores for good equations in symbolic regression
When Good Equations Get Bad Scores: Improving Symbolic Regression Through Better Parameter Optimization

Boxiao Wang +7
cs.CV 2026-05-22 reviewed

EvalVerse calibrates VLMs to expert cinematic video standards
EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation

Songlin Yang +25
cs.CV 2026-05-22 reviewed

Hybrid planner reaches 94.85 on NAVSIM
ChainFlow-VLA: Causal Flow Planning with Vision-Language Models

Xiyang Wang +9
cs.CV 2026-05-22 reviewed

Coloring noise in Sobolev space fixes SR spectral mismatch
Coloring the Noise: Adversarial Sobolev Alignment for Faithful Image Super Resolution

Hongbo Wang +5
cs.RO 2026-05-22 reviewed

Four-layer network lets robots respond to humans at millisecond speed
6G Communication Networks Enabling Embodied Agents: Architecture and Prototype

Lipeng Dai +2
cs.AI 2026-05-22 reviewed

Three steps tie benchmark scores to real knowledge-work claims
Design and Report Benchmarks for Knowledge Work

Yining Hua +3
cs.LG 2026-05-22 reviewed

Multi-gate residuals stabilize deep nets without extra comms cost
Multi-Gate Residuals

Zhizhan Zheng +6
cs.LG 2026-05-22 reviewed

RefCal jointly optimizes calibration and refinement in DNNs
Enhancing Deep Neural Network Reliability with Refinement and Calibration

Ramya Hebbalaguppe +3
cs.CV 2026-05-22 reviewed

Single-frame edit extends across video via diffusion priors
SimInsert: Seamless Video Object Insertion via Regional Sparse Attention Fusion

Xinyu Chen +11
cs.CR 2026-05-22 reviewed

Frontier LLMs cover only 4-8% of real vulnerabilities in black-box tests
Are Frontier LLMs Ready for Cybersecurity? Evidence for Vertical Foundation Models from Dual-Mode Vulnerability Benchmarks

Vivek Dahiya +4
cs.AI 2026-05-22 reviewed

Generated card games expose jagged strategic skills in top LLMs
GENSTRAT: Toward a Science of Strategic Reasoning in Large Language Models

Vartan Shadarevian +3
cs.LG 2026-05-22 reviewed

Prefix prompts let frozen LLMs condition flows for multi-modal forecasts
PaP-NF: Probabilistic Long-Term Time Series Forecasting via Prefix-as-Prompt Reprogramming and Normalizing Flows

Minju Kim +1
cs.AI 2026-05-22 reviewed

Graph protocol coordinates agents with humans and institutions
Foundation Protocol: A Coordination Layer for Agentic Society

Bang Liu +28
cs.LG 2026-05-22 reviewed

Kernel agents top out at 0.94x production baselines
FastKernels: Benchmarking GPU Kernel Generation in Production

Gabriele Oliaro +7
cs.AI 2026-05-22 reviewed

AI research automation credible only in structured domains
AutoResearch AI: Towards AI-Powered Research Automation for Scientific Discovery

Guiyao Tie +22
cs.CV 2026-05-22 reviewed

Homography mapping yields linear bounds for camera motion verification
Lipschitz Optimization for Formal Verification of Homographies

Jean-Guillaume Durand +3
cs.LG 2026-05-22 reviewed

Region quotas stop wipe-out of reasoning blocks in KV caches
Adaptive Mass-Segmented KV Compression for Long-Context Reasoning

Junzhe Yang +1
cs.LG 2026-05-22 reviewed

Pretrained graph model improves low-data OPF accuracy
Scalable Heterogeneous Graph Foundation Models for Data-Driven Optimal Power Flow in Smart Grids

Massimiliano Lupo Pasini +3
cs.AI 2026-05-22 reviewed

Accountability can keep AI capabilities integrated even when tech modularizes
Redrawing the AI Map: A Theory of Accountability Boundaries in Agentic Ecosystems

Muhammad Zia Hydari +1
cs.LG 2026-05-22 reviewed

Symmetric noise lifts AlpacaEval scores from 65% to 69% in fine-tuning
Understanding and Improving Noisy Embedding Techniques in Instruction Finetuning

Abhay Yadav
cs.CL 2026-05-22 reviewed

LLMs drop up to 88 points when tasks move to context middle
Positional Failures in Long-Context LLMs: A Blind Spot in Reasoning Benchmarks

Chuyifei Zhang +3
cs.CR 2026-05-22 reviewed

10 poisoned examples hijack targeted LLM tasks at 70%+ success
PoisonForge: Task-Level Targeted Poisoning Benchmark for Instruction-Tuned LLMs

Luze Sun +4
cs.RO 2026-05-22 reviewed

VLM boosts robot map coverage by 24% in tests
Autonomous Frontier-Based Exploration with VLM Guidance

Aarush Aitha +1
econ.GN 2026-05-22 reviewed

Firms lower AI job exposure mostly by shifting hires across roles
Generative AI and the Reorganization of Labor Demand

Fangyan Wang +2
cs.CL 2026-05-22 reviewed

Role prompts split into additive persona and task vectors at one site
As X, Do Y: How Persona and Task Combine in Instruction-Tuned LLMs

Eric Xu
cs.LG 2026-05-22 reviewed

Infra-Bayesian RL records lower worst-case regret than classical agents
Infra-Bayesian Reinforcement Learning Agents Outperform Classical RL For Worst-Case Robustness

Manish Aryal +12
cs.LG 2026-05-22 reviewed

Channel relevance steers contrastive samples for time series anomaly detection
CALAD: Channel-Aware contrastive Learning for multivariate time series Anomaly Detection

Jaehyeop Hong +1
quant-ph 2026-05-22 reviewed

RL selects Clifford states that boost VQA energy accuracy 3x on average
Classical State Preparation for Variational Quantum Algorithms via Reinforcement Learning

Gino Kwun +2
cs.CY 2026-05-22 reviewed

Five dimensions of AI fatigue emerge from student accounts
Defining AI Fatigue in Academic Contexts: Dimensions, Indicators, and a Stage-Based Model Using Grounded Theory

John Paul P. Miranda +2
cs.CV 2026-05-22 reviewed

Verified prompts plus longitudinal context raise lesion tracking Dice by 4.5 points
Exploiting Longitudinal Context in Clinician-Verified Interactive Lesion Tracking

Yannick Kirchhoff +7
cs.CV 2026-05-22 reviewed

One frozen VLM detects video anomalies without training
CoReVAD: A Contextual Reasoning Framework for Training-Free Video Anomaly Detection

Hyeongmuk Lim +1
cs.AI 2026-05-22 reviewed

AI agent produces verified distributed systems on all 7 tests
Inductive Deductive Synthesis: Enabling AI to Generate Formally Verified Systems

Shubham Agarwal +12
cs.SE 2026-05-21 reviewed

Philosophical dispositions produce 51% unique AI code review findings
Philosophical Dispositions as Behavioral Constraints for AI-Assisted Code Review: An Empirical Study

Kaushal Bansal