hub Canonical reference

Robomonkey: Scaling test-time sampling and verification for vision-language-action models

Canonical reference. 100% of citing Pith papers cite this work as background.

19 Pith papers citing it
Background 100% of classified citations

hub tools

citation-role summary

background 5

citation-polarity summary

years

2026: 18 · 2024: 1

verdicts

UNVERDICTED 19

roles

background 5

polarities

background 5

representative citing papers

Improving Robotic Generalist Policies via Flow Reversal Steering

cs.RO · 2026-06-11 · unverdicted · novelty 7.0

Flow Reversal Steering steers flow matching generalist policies by reversing suboptimal actions to nearby better modes, enabling improved zero-shot control, quick distillation, and RL bootstrapping in robotic manipulation.

When to Act, Ask, or Learn: Uncertainty-Aware Policy Steering

cs.RO · 2026-02-25 · unverdicted · novelty 7.0

The UPS framework uses conformal prediction to calibrate VLM verifiers that choose among executing a high-confidence action, asking a natural-language task query, or requesting a policy intervention. It then applies residual learning from those interventions to continually improve the base policy with minimal feedback.

Sequential Planning via Anchored Robotic Keypoints

cs.RO · 2026-06-29 · unverdicted · novelty 6.0

SPARK reaches 43.7% success on six LIBERO-PRO cells using LLM-generated typed behavior trees together with multi-prompt perception and recovery, more than doubling the success rate of the CaP-Agent0 and VLA baselines.

DREAM-Chunk: Reactive Action Chunking with Latent World Model

cs.RO · 2026-06-17 · unverdicted · novelty 6.0

DREAM-Chunk uses test-time sampling and latent-world-model rollouts to select robust action chunks from chunking-based VLA policies, improving performance under stochastic dynamics on simulation and hardware tasks.

FASTER: Value-Guided Sampling for Fast RL

cs.LG · 2026-04-21 · unverdicted · novelty 6.0

FASTER models multi-candidate denoising as an MDP and trains a value function to filter actions early, delivering the performance of full sampling at lower cost in diffusion RL policies.

A Survey on Vision-Language-Action Models for Embodied AI

cs.RO · 2024-05-23 · unverdicted · novelty 6.0

This is the first survey on vision-language-action models, providing a taxonomy across three lines, plus summaries of datasets, simulators, benchmarks, challenges, and future directions in embodied AI.

Position: Good Embodied Reward Models Need Bad Behavior Data

cs.RO · 2026-05-31 · unverdicted · novelty 4.0

Embodied reward models systematically over-reward unsafe, suboptimal, and shortcut robot behaviors due to training on successful data only, and modest inclusion of bad behavior data improves alignment with human preferences.
