pith.

archive

Every paper Pith has read. Search by title, abstract, or pith.

14513 papers in cs.AI · page 2

  1. cs.LG 2026-05-22 reviewed
    Reflection symmetry speeds up state-based RL

    Reflex: Reinforcement Learning with Reflection Symmetry Exploitation in State-Based Continuous Control

    Shuai Zhen +3

  2. cs.AI 2026-05-22 reviewed
    Consistency checks raise LLM multi-agent planning success 9.75%

    When Planning Fails Despite Correct Execution: On Epistemic Calibration for LLM-Based Multi-Agent Systems

    Zehao Wang +3

  3. cs.CV 2026-05-22 reviewed
    3D CNNs spot and name hand gestures in live video

    Online Hand Gesture Recognition Using 3D Convolutional Neural Networks

    Yinghao Qin +1

  4. cs.LG 2026-05-22 reviewed
    PPM maps parametric priors into generative forecasts

    Parametric Prior Mapping Framework for Non-stationary Probabilistic Time Series Forecasting

    Jinglin Li +3

  5. cs.LG 2026-05-22 reviewed
    One recursion decomposes every component into paths and token credit

    Every Component is a Lookup: Token Attribution and Composition from a Single Decomposition

    Po-Kai Chen +2

  6. cs.CL 2026-05-22 reviewed
    Metacognitive rewards lift LLM reasoning up to 11 percent

    Metacognition as Reward: Reinforcing LLM Reasoning via Knowledge and Regulation Signals

    Sirui Chen +8

  7. cs.LG 2026-05-22 reviewed
    VAE turns non-Euclidean tasks into measurable space for RL curricula

    Curriculum reinforcement learning with measurable task representation learning

    Yongyan Wen +5

  8. cs.LG 2026-05-22 reviewed
    One-step MeanFlow policy hits SOTA on locomotion tasks

    Score-Based One-step MeanFlow Policy Optimization

    Kyungyoon Kim +3

  9. cs.DC 2026-05-22 reviewed
    Router trims LLM inference tail latency by 52% at wind farms

    XWind: A Cross-site Router for Large Language Model Inference Serving at Renewable Energy Farms

    Tella Rajashekhar Reddy +9

  10. cs.CV 2026-05-22 reviewed
    Uncertainty gate activates contrastive decoding only on risky tokens

    CHASD: Language Increment-Calibrated Contrastive Decoding against Hallucination in LVLMs

    Xiaoyi Huang +2

  11. cs.RO 2026-05-22 reviewed
    Assembling trajectories from primitives reduces error ratio to 1.07

    Sparse Compositional Flow Matching by geometric assembly from motion primitives

    Yan Tang +4

  12. cs.AI 2026-05-22 reviewed
    Contextual bandit adapts ventilator choices to clinician style

    Human-in-the-Loop Multi-Agent Ventilator Decision Support with Contextual Bandit Preference Learning

    Sijia Li +9

  13. cs.CL 2026-05-22 reviewed
    Models converge on representations but diverge on reasoning

    Convergence Without Understanding: When Language Models Agree on Representations but Disagree on Reasoning

    Muhammad Usama +1

  14. cs.AI 2026-05-22 reviewed
    Semantic check enables safe rollback for tool agents

    DART: Semantic Recoverability for Structured Tool Agents

    Ke Yang +5

  15. cs.AI 2026-05-22 reviewed
    OKBs compile AI regulations into executable validation modules

    Ontological Knowledge Blocks: Executable Compliance and Profile-Based Validation for Trustworthy AI Systems

    Aasish Kumar Sharma +1

  16. cs.AI 2026-05-22 reviewed
    Parallel compaction reduces LLM agent wall time with predictable summaries

    Parallel Context Compaction for Long-Horizon LLM Agent Serving

    Musa Cim +3

  17. cs.LG 2026-05-22 reviewed
    Reinforcement learning enforces exact graph assortativity without tuning

    Reinforcement Learning for Microcanonical Graph Ensemble with Assortativity Constraints

    Hoyun Choi +2

  18. cs.LG 2026-05-22 reviewed
    Structural priors fix bad scores for good equations in symbolic regression

    When Good Equations Get Bad Scores: Improving Symbolic Regression Through Better Parameter Optimization

    Boxiao Wang +7

  19. cs.CV 2026-05-22 reviewed
    EvalVerse calibrates VLMs to expert cinematic video standards

    EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation

    Songlin Yang +25

  20. cs.CV 2026-05-22 reviewed
    Hybrid planner reaches 94.85 on NAVSIM

    ChainFlow-VLA: Causal Flow Planning with Vision-Language Models

    Xiyang Wang +9

  21. cs.CV 2026-05-22 reviewed
    Coloring noise in Sobolev space fixes SR spectral mismatch

    Coloring the Noise: Adversarial Sobolev Alignment for Faithful Image Super Resolution

    Hongbo Wang +5

  22. cs.RO 2026-05-22 reviewed
    Four-layer network lets robots respond to humans at millisecond speed

    6G Communication Networks Enabling Embodied Agents: Architecture and Prototype

    Lipeng Dai +2

  23. cs.AI 2026-05-22 reviewed
    Three steps tie benchmark scores to real knowledge-work claims

    Design and Report Benchmarks for Knowledge Work

    Yining Hua +3

  24. cs.LG 2026-05-22 reviewed
  25. cs.LG 2026-05-22 reviewed
    RefCal jointly optimizes calibration and refinement in DNNs

    Enhancing Deep Neural Network Reliability with Refinement and Calibration

    Ramya Hebbalaguppe +3

  26. cs.CV 2026-05-22 reviewed
    Single-frame edit extends across video via diffusion priors

    SimInsert: Seamless Video Object Insertion via Regional Sparse Attention Fusion

    Xinyu Chen +11

  27. cs.CR 2026-05-22 reviewed
    Frontier LLMs cover only 4-8% of real vulnerabilities in black-box tests

    Are Frontier LLMs Ready for Cybersecurity? Evidence for Vertical Foundation Models from Dual-Mode Vulnerability Benchmarks

    Vivek Dahiya +4

  28. cs.AI 2026-05-22 reviewed
    Generated card games expose jagged strategic skills in top LLMs

    GENSTRAT: Toward a Science of Strategic Reasoning in Large Language Models

    Vartan Shadarevian +3

  29. cs.LG 2026-05-22 reviewed
    Prefix prompts let frozen LLMs condition flows for multi-modal forecasts

    PaP-NF: Probabilistic Long-Term Time Series Forecasting via Prefix-as-Prompt Reprogramming and Normalizing Flows

    Minju Kim +1

  30. cs.AI 2026-05-22 reviewed
    Graph protocol coordinates agents with humans and institutions

    Foundation Protocol: A Coordination Layer for Agentic Society

    Bang Liu +28

  31. cs.LG 2026-05-22 reviewed
    Kernel agents top out at 0.94x production baselines

    FastKernels: Benchmarking GPU Kernel Generation in Production

    Gabriele Oliaro +7

  32. cs.AI 2026-05-22 reviewed
    AI research automation credible only in structured domains

    AutoResearch AI: Towards AI-Powered Research Automation for Scientific Discovery

    Guiyao Tie +22

  33. cs.CV 2026-05-22 reviewed
    Homography mapping yields linear bounds for camera motion verification

    Lipschitz Optimization for Formal Verification of Homographies

    Jean-Guillaume Durand +3

  34. cs.LG 2026-05-22 reviewed
    Region quotas stop wipe-out of reasoning blocks in KV caches

    Adaptive Mass-Segmented KV Compression for Long-Context Reasoning

    Junzhe Yang +1

  35. cs.LG 2026-05-22 reviewed
    Pretrained graph model improves low-data OPF accuracy

    Scalable Heterogeneous Graph Foundation Models for Data-Driven Optimal Power Flow in Smart Grids

    Massimiliano Lupo Pasini +3

  36. cs.AI 2026-05-22 reviewed
    Accountability can keep AI capabilities integrated even when tech modularizes

    Redrawing the AI Map: A Theory of Accountability Boundaries in Agentic Ecosystems

    Muhammad Zia Hydari +1

  37. cs.LG 2026-05-22 reviewed
    Symmetric noise lifts AlpacaEval scores from 65% to 69% in fine-tuning

    Understanding and Improving Noisy Embedding Techniques in Instruction Finetuning

    Abhay Yadav

  38. cs.CL 2026-05-22 reviewed
    LLMs drop up to 88 points when tasks move to context middle

    Positional Failures in Long-Context LLMs: A Blind Spot in Reasoning Benchmarks

    Chuyifei Zhang +3

  39. cs.CR 2026-05-22 reviewed
    10 poisoned examples hijack targeted LLM tasks at 70%+ success

    PoisonForge: Task-Level Targeted Poisoning Benchmark for Instruction-Tuned LLMs

    Luze Sun +4

  40. cs.RO 2026-05-22 reviewed
    VLM boosts robot map coverage by 24% in tests

    Autonomous Frontier-Based Exploration with VLM Guidance

    Aarush Aitha +1

  41. econ.GN 2026-05-22 reviewed
    Firms lower AI job exposure mostly by shifting hires across roles

    Generative AI and the Reorganization of Labor Demand

    Fangyan Wang +2

  42. cs.CL 2026-05-22 reviewed
    Role prompts split into additive persona and task vectors at one site

    As X, Do Y: How Persona and Task Combine in Instruction-Tuned LLMs

    Eric Xu

  43. cs.LG 2026-05-22 reviewed
    Infra-Bayesian RL records lower worst-case regret than classical agents

    Infra-Bayesian Reinforcement Learning Agents Outperform Classical RL For Worst-Case Robustness

    Manish Aryal +12

  44. cs.LG 2026-05-22 reviewed
    Channel relevance steers contrastive samples for time series anomaly detection

    CALAD: Channel-Aware contrastive Learning for multivariate time series Anomaly Detection

    Jaehyeop Hong +1

  45. quant-ph 2026-05-22 reviewed
    RL selects Clifford states that boost VQA energy accuracy 3x on average

    Classical State Preparation for Variational Quantum Algorithms via Reinforcement Learning

    Gino Kwun +2

  46. cs.CY 2026-05-22 reviewed
    Five dimensions of AI fatigue emerge from student accounts

    Defining AI Fatigue in Academic Contexts: Dimensions, Indicators, and a Stage-Based Model Using Grounded Theory

    John Paul P. Miranda +2

  47. cs.CV 2026-05-22 reviewed
    Verified prompts plus longitudinal context raise lesion tracking Dice by 4.5 points

    Exploiting Longitudinal Context in Clinician-Verified Interactive Lesion Tracking

    Yannick Kirchhoff +7

  48. cs.CV 2026-05-22 reviewed
    One frozen VLM detects video anomalies without training

    CoReVAD: A Contextual Reasoning Framework for Training-Free Video Anomaly Detection

    Hyeongmuk Lim +1

  49. cs.AI 2026-05-22 reviewed
    AI agent produces verified distributed systems on all 7 tests

    Inductive Deductive Synthesis: Enabling AI to Generate Formally Verified Systems

    Shubham Agarwal +12

  50. cs.SE 2026-05-21 reviewed
    Philosophical dispositions produce 51% unique AI code review findings

    Philosophical Dispositions as Behavioral Constraints for AI-Assisted Code Review: An Empirical Study

    Kaushal Bansal