archive
Every paper Pith has read. Search by title, abstract, or pith.
14,513 papers in cs.AI · page 2
-
Reflection symmetry speeds up state-based RL
Reflex: Reinforcement Learning with Reflection Symmetry Exploitation in State-Based Continuous Control
-
Consistency checks raise LLM multi-agent planning success 9.75%
When Planning Fails Despite Correct Execution: On Epistemic Calibration for LLM-Based Multi-Agent Systems
-
3D CNNs spot and name hand gestures in live video
Online Hand Gesture Recognition Using 3D Convolutional Neural Networks
-
PPM maps parametric priors into generative forecasts
Parametric Prior Mapping Framework for Non-stationary Probabilistic Time Series Forecasting
-
One recursion decomposes every component into paths and token credit
Every Component is a Lookup: Token Attribution and Composition from a Single Decomposition
-
Metacognitive rewards lift LLM reasoning up to 11 percent
Metacognition as Reward: Reinforcing LLM Reasoning via Knowledge and Regulation Signals
-
VAE turns non-Euclidean tasks into measurable space for RL curricula
Curriculum reinforcement learning with measurable task representation learning
-
One-step MeanFlow policy hits SOTA on locomotion tasks
Score-Based One-step MeanFlow Policy Optimization
-
Router trims LLM inference tail latency by 52% at wind farms
XWind: A Cross-site Router for Large Language Model Inference Serving at Renewable Energy Farms
-
Uncertainty gate activates contrastive decoding only on risky tokens
CHASD: Language Increment-Calibrated Contrastive Decoding against Hallucination in LVLMs
-
Assembling trajectories from primitives reduces error ratio to 1.07
Sparse Compositional Flow Matching by geometric assembly from motion primitives
-
Contextual bandit adapts ventilator choices to clinician style
Human-in-the-Loop Multi-Agent Ventilator Decision Support with Contextual Bandit Preference Learning
-
Models converge on representations but diverge on reasoning
Convergence Without Understanding: When Language Models Agree on Representations but Disagree on Reasoning
-
Semantic check enables safe rollback for tool agents
DART: Semantic Recoverability for Structured Tool Agents
-
OKBs compile AI regulations into executable validation modules
Ontological Knowledge Blocks: Executable Compliance and Profile-Based Validation for Trustworthy AI Systems
-
Parallel compaction reduces LLM agent wall time with predictable summaries
Parallel Context Compaction for Long-Horizon LLM Agent Serving
-
Reinforcement learning enforces exact graph assortativity without tuning
Reinforcement Learning for Microcanonical Graph Ensemble with Assortativity Constraints
-
Structural priors fix bad scores for good equations in symbolic regression
When Good Equations Get Bad Scores: Improving Symbolic Regression Through Better Parameter Optimization
-
EvalVerse calibrates VLMs to expert cinematic video standards
EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation
-
Hybrid planner reaches 94.85 on NAVSIM
ChainFlow-VLA: Causal Flow Planning with Vision-Language Models
-
Coloring noise in Sobolev space fixes SR spectral mismatch
Coloring the Noise: Adversarial Sobolev Alignment for Faithful Image Super Resolution
-
Four-layer network lets robots respond to humans at millisecond speed
6G Communication Networks Enabling Embodied Agents: Architecture and Prototype
-
Three steps tie benchmark scores to real knowledge-work claims
Design and Report Benchmarks for Knowledge Work
-
Multi-gate residuals stabilize deep nets without extra comms cost
Multi-Gate Residuals
-
RefCal jointly optimizes calibration and refinement in DNNs
Enhancing Deep Neural Network Reliability with Refinement and Calibration
-
Single-frame edit extends across video via diffusion priors
SimInsert: Seamless Video Object Insertion via Regional Sparse Attention Fusion
-
Frontier LLMs cover only 4-8% of real vulnerabilities in black-box tests
Are Frontier LLMs Ready for Cybersecurity? Evidence for Vertical Foundation Models from Dual-Mode Vulnerability Benchmarks
-
Generated card games expose jagged strategic skills in top LLMs
GENSTRAT: Toward a Science of Strategic Reasoning in Large Language Models
-
Prefix prompts let frozen LLMs condition flows for multi-modal forecasts
PaP-NF: Probabilistic Long-Term Time Series Forecasting via Prefix-as-Prompt Reprogramming and Normalizing Flows
-
Graph protocol coordinates agents with humans and institutions
Foundation Protocol: A Coordination Layer for Agentic Society
-
Kernel agents top out at 0.94x production baselines
FastKernels: Benchmarking GPU Kernel Generation in Production
-
AI research automation credible only in structured domains
AutoResearch AI: Towards AI-Powered Research Automation for Scientific Discovery
-
Homography mapping yields linear bounds for camera motion verification
Lipschitz Optimization for Formal Verification of Homographies
-
Region quotas stop wipe-out of reasoning blocks in KV caches
Adaptive Mass-Segmented KV Compression for Long-Context Reasoning
-
Pretrained graph model improves low-data OPF accuracy
Scalable Heterogeneous Graph Foundation Models for Data-Driven Optimal Power Flow in Smart Grids
-
Accountability can keep AI capabilities integrated even when tech modularizes
Redrawing the AI Map: A Theory of Accountability Boundaries in Agentic Ecosystems
-
Symmetric noise lifts AlpacaEval scores from 65% to 69% in fine-tuning
Understanding and Improving Noisy Embedding Techniques in Instruction Finetuning
-
LLMs drop up to 88 points when tasks move to context middle
Positional Failures in Long-Context LLMs: A Blind Spot in Reasoning Benchmarks
-
10 poisoned examples hijack targeted LLM tasks at 70%+ success
PoisonForge: Task-Level Targeted Poisoning Benchmark for Instruction-Tuned LLMs
-
VLM boosts robot map coverage by 24% in tests
Autonomous Frontier-Based Exploration with VLM Guidance
-
Firms lower AI job exposure mostly by shifting hires across roles
Generative AI and the Reorganization of Labor Demand
-
Role prompts split into additive persona and task vectors at one site
As X, Do Y: How Persona and Task Combine in Instruction-Tuned LLMs
-
Infra-Bayesian RL records lower worst-case regret than classical agents
Infra-Bayesian Reinforcement Learning Agents Outperform Classical RL For Worst-Case Robustness
-
Channel relevance steers contrastive samples for time series anomaly detection
CALAD: Channel-Aware contrastive Learning for multivariate time series Anomaly Detection
-
RL selects Clifford states that boost VQA energy accuracy 3x on average
Classical State Preparation for Variational Quantum Algorithms via Reinforcement Learning
-
Five dimensions of AI fatigue emerge from student accounts
Defining AI Fatigue in Academic Contexts: Dimensions, Indicators, and a Stage-Based Model Using Grounded Theory
-
Verified prompts plus longitudinal context raise lesion tracking Dice by 4.5 points
Exploiting Longitudinal Context in Clinician-Verified Interactive Lesion Tracking
-
One frozen VLM detects video anomalies without training
CoReVAD: A Contextual Reasoning Framework for Training-Free Video Anomaly Detection
-
AI agent produces verified distributed systems on all 7 tests
Inductive Deductive Synthesis: Enabling AI to Generate Formally Verified Systems
-
Philosophical dispositions produce 51% unique AI code review findings
Philosophical Dispositions as Behavioral Constraints for AI-Assisted Code Review: An Empirical Study