A vision-language-action-critic model for robotic real-world reinforcement learning. CoRR, abs/2509.15937, 2025

Shaopeng Zhai, Qi Zhang, Tianyi Zhang, Fuxian Huang, Haoran Zhang, Ming Zhou, Shengzhe Zhang, Litao Liu, Sixu Lin, Jiangmiao Pang · 2025 · DOI 10.48550/arxiv.2

22 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 4

citation-polarity summary

background 3 · support 1

representative citing papers

Apparent Psychological Profiles of Large Language Models are Largely a Measurement Artifact

cs.AI · 2026-06-18 · unverdicted · novelty 7.0

Apparent psychological profiles of LLMs are largely measurement artifacts driven by directional response bias rather than actual traits.

Learn When and Where to Connect: Adaptive Virtual Nodes for Dynamic Message Passing on Graphs

cs.LG · 2026-06-02 · unverdicted · novelty 7.0

MAVN adaptively selects and connects virtual nodes in MPNNs via learned dual-perspective preferences, proves it can realize any connectivity pattern, and reports up to 46.5% gains over backbones on nine datasets.

SkCC: Portable and Secure Skill Compilation for Cross-Framework LLM Agents

cs.CR · 2026-05-05 · unverdicted · novelty 7.0 · 2 refs

SkCC introduces a typed intermediate representation and compiler pipeline to make LLM agent skills portable across frameworks and enforce security constraints before deployment.

RAG-Reflect: Agentic Retrieval-Augmented Generation with Reflections for Comment-Driven Code Maintenance on Stack Overflow

cs.SE · 2026-04-24 · unverdicted · novelty 7.0

RAG-Reflect achieves F1=0.78 on valid comment-edit prediction using retrieval-augmented reasoning and self-reflection, outperforming baselines and approaching fine-tuned models without retraining.

Decision Trace Schema for Governance Evidence in Real-Time Risk Systems

cs.CY · 2026-04-10 · unverdicted · novelty 7.0

The Decision Event Schema (DES) is a unified JSON schema that records governance evidence from four infrastructure layers in a single per-decision event structure with tiered completeness options.

Rank-Then-Act: Reward-Free Control from Frame-Order Progress

cs.LG · 2026-07-02 · unverdicted · novelty 6.0

RTA trains a VLM as a progress ordinal scorer via GRPO on shuffled expert frames and uses Spearman rank correlation with temporal indices as a bounded RL reward, matching or exceeding prior video reward methods on discrete and continuous control benchmarks.

Agentic AI-Powered Re-Identification: An Emerging, Scalable Threat to Mobility Microdata Privacy

cs.CR · 2026-06-26 · conditional · novelty 6.0

Agentic AI re-identifies 72% of individuals from simulated mobility traces by cross-referencing public web sources without human intervention.

Mahalanobis PatchCore: Covariance-Aware and Streaming-Compatible Industrial Anomaly Detection

cs.CV · 2026-05-26 · unverdicted · novelty 6.0

Mahalanobis PatchCore adds covariance-aware whitening and incremental streaming aggregation to PatchCore, preserving benchmark performance while cutting peak memory from 5.41 GB to 2.78 GB and raising mean industrial AUC from 0.981 to 0.986.

Runtime-Orchestrated Second-Order Optimization for Scalable LLM Training

cs.DC · 2026-05-15 · unverdicted · novelty 6.0

Asteria is a runtime system that enables second-order optimization for LLMs by dynamically distributing optimizer state across GPU, CPU, and NVMe while using asynchronous inverse-root computations and bounded-staleness synchronization.

Property-Level Reconstructability of Agent Decisions: An Anchor-Level Pilot Across Vendor SDK Adapter Regimes

cs.SE · 2026-05-12 · unverdicted · novelty 6.0

Pilot study shows agent decision reconstructability varies by vendor SDK regime, with completeness scores from 42.9% to 85.7% and consistent gaps in reasoning traces.

The First Drop of Ink: Nonlinear Impact of Misleading Information in Long-Context Reasoning

cs.AI · 2026-05-11 · unverdicted · novelty 6.0

Hard distractors trigger a nonlinear 'First Drop of Ink' performance collapse in long-context LLM reasoning, with most damage from the initial small fraction via disproportionate attention.

High Precision Hydraulic Excavator Control for Heavy-Duty Grading

cs.RO · 2026-05-10 · unverdicted · novelty 6.0

Autonomous excavator controller achieves 1.8 cm RMSE in heavy-duty grading across different hydraulic architectures, outperforming commercial solutions by a factor of 2.6 in precision while better utilizing machine pressure.

Dual-Guard: Dual-Channel Latent Watermarking for Provenance and Tamper Localization in Diffusion Images

cs.CR · 2026-04-21 · unverdicted · novelty 6.0

Dual-Guard embeds complementary watermarks in diffusion image generation to verify provenance and localize tampering with low error rates on a 2400-sample benchmark under reprompting and editing attacks.

A Study of LLMs' Preferences for Libraries and Programming Languages

cs.SE · 2025-03-21 · unverdicted · novelty 6.0

Empirical study of eight LLMs finds that they overuse popular libraries like NumPy, unnecessarily in up to 45% of cases, and strongly default to Python even when it is suboptimal.

ATM: CID-Brokered Pre-Write Admission for Multi-Agent Code Co-Synthesis

cs.SE · 2026-06-29 · unverdicted · novelty 5.0

ATM is a CID-brokered governance framework that maps write intents to semantic atoms for pre-admission control, validation, and neutral-steward application in single-domain multi-agent code synthesis.

KAPPS: A knowledge-based CPPS Architecture for the Circular Factory

cs.AI · 2026-05-21 · unverdicted · novelty 5.0

KAPPS is a knowledge-based CPPS architecture that uses an ontology-grounded knowledge graph as the unifying data backbone and authoritative write-time state for handling uncertainty in circular manufacturing, demonstrated via anomaly detection and constraint enforcement use cases.

Governed Auditable Decisioning Under Uncertainty: Synthesis and Agentic Extension

cs.CY · 2026-04-21 · unverdicted · novelty 5.0

Synthesizes a governance evidence framework revealing a coverage gradient from full auditability in rule engines to structural breaks in agentic AI, with a cascade of uncertainty and four formal propositions.

Analysing drivers and interdependencies in European electricity markets using XAI

cs.AI · 2026-06-17 · unverdicted · novelty 4.0

DNNs plus SHAP/SSHAP applied to 39 European bidding zones identify solar and gas as key price drivers and simulate a single-price EU market.

MimirRAG: A Multi-Agent RAG Framework for Financial Data Retrieval with Metadata Integration

cs.LG · 2026-05-24 · unverdicted · novelty 4.0

MimirRAG, a multi-agent RAG framework with metadata integration and table-aware chunking, reaches 89.3% accuracy on FinanceBench and outperforms prior baselines for financial document retrieval.

Decision Evidence Maturity Model for Agentic AI: A Property-Level Method Specification

cs.CY · 2026-04-29 · unverdicted · novelty 4.0

DEMM defines four executable evidence-sufficiency categories plus a conflicting category for agentic AI decisions and rolls per-property verdicts into a five-level maturity rubric.

A Cloud-Native Architecture for Human-in-Control LLM-Assisted OpenSearch in Investigative Settings

cs.DC · 2026-04-22 · unverdicted · novelty 4.0

A human-in-control LLM architecture translates natural language to OpenSearch DSL queries using hybrid lexical and semantic search in a secure private-cloud setup, shown via prototype on the Enron dataset.

Evaluating LLM-Generated ACSL Annotations for Formal Verification

cs.SE · 2026-02-14 · unverdicted · novelty 4.0

Rule-based annotation generation for ACSL outperforms LLM-based methods in achieving successful formal verification of C programs.

citing papers explorer

Showing 3 of 3 citing papers after filters.

Learn When and Where to Connect: Adaptive Virtual Nodes for Dynamic Message Passing on Graphs

cs.LG · 2026-06-02 · unverdicted · none · ref 61

MAVN adaptively selects and connects virtual nodes in MPNNs via learned dual-perspective preferences, proves it can realize any connectivity pattern, and reports up to 46.5% gains over backbones on nine datasets.

Rank-Then-Act: Reward-Free Control from Frame-Order Progress

cs.LG · 2026-07-02 · unverdicted · none · ref 29

RTA trains a VLM as a progress ordinal scorer via GRPO on shuffled expert frames and uses Spearman rank correlation with temporal indices as a bounded RL reward, matching or exceeding prior video reward methods on discrete and continuous control benchmarks.

MimirRAG: A Multi-Agent RAG Framework for Financial Data Retrieval with Metadata Integration

cs.LG · 2026-05-24 · unverdicted · none · ref 37

MimirRAG, a multi-agent RAG framework with metadata integration and table-aware chunking, reaches 89.3% accuracy on FinanceBench and outperforms prior baselines for financial document retrieval.
