TouchGuide: Inference-Time Steering of Visuomotor Policies via Touch Guidance

Zhemeng Zhang, Jiahua Ma, Xincheng Yang, Xin Wen, Yuzhi Zhang, Boyan Li, Yiran Qin, Jin Liu, Can Zhao, Li Kang, Haoqin Hong, Zhenfei Yin, Philip Torr, Hao Su, Ruimao Zhang, Daolin Ma

classification: cs.RO

keywords: touchguide, tactile, action, contact, data, physical, challenging, collection

Fine-grained and contact-rich manipulation remains challenging for robots, largely due to the underutilization of tactile feedback. To address this, we introduce TouchGuide, a novel cross-policy visuo-tactile fusion paradigm that fuses modalities within a low-dimensional action space. Specifically, TouchGuide operates in two stages to guide a pre-trained diffusion or flow-matching visuomotor policy at inference time. First, the policy produces a coarse, visually plausible action using only visual inputs during early sampling. Second, a task-specific Contact Physical Model (CPM) provides tactile guidance to steer and refine the action, ensuring it aligns with realistic physical contact conditions. Trained through contrastive learning on limited expert demonstrations, the CPM provides a tactile-informed feasibility score to steer the sampling process toward refined actions that satisfy physical contact constraints. Furthermore, to facilitate TouchGuide training with high-quality and cost-effective data, we introduce TacUMI, a data collection system. TacUMI achieves a favorable trade-off between precision and affordability; by leveraging rigid fingertips, it obtains direct tactile feedback, thereby enabling the collection of reliable tactile data. Extensive experiments on five challenging contact-rich tasks, such as shoe lacing and chip handover, show that TouchGuide consistently and significantly outperforms state-of-the-art visuo-tactile policies.
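
The two-stage sampling described in the abstract can be illustrated with a toy sketch. The abstract does not state the update rule, so everything here is an assumption made for illustration: the policy is replaced by a hypothetical velocity field that pulls the action toward a "visually plausible" target, the CPM feasibility score is replaced by a hypothetical quadratic score around a "contact-feasible" target, and guidance is applied by adding the score gradient to the velocity during the later sampling steps. The names `toy_policy_velocity`, `toy_cpm_score`, `guide_from`, and `guide_weight` are invented for this example and do not come from the paper.

```python
import numpy as np


def toy_policy_velocity(action, visual_target, gain=5.0):
    """Hypothetical stand-in for a pre-trained policy's velocity field."""
    return gain * (visual_target - action)


def toy_cpm_score(action, contact_target):
    """Hypothetical stand-in for a feasibility score (higher is better)."""
    return -0.5 * float(np.sum((action - contact_target) ** 2))


def toy_cpm_score_grad(action, contact_target):
    """Analytic gradient of toy_cpm_score with respect to the action."""
    return contact_target - action


def sample_action(visual_target, contact_target, steps=50,
                  guide_from=0.5, guide_weight=10.0, seed=0):
    """Euler sampling: vision-only early steps, score-guided late steps."""
    rng = np.random.default_rng(seed)
    action = rng.standard_normal(visual_target.shape)  # start from noise
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        # Stage 1 (assumed): follow the vision-only policy.
        velocity = toy_policy_velocity(action, visual_target)
        # Stage 2 (assumed): add the gradient of the feasibility score.
        if t >= guide_from:
            velocity = velocity + guide_weight * toy_cpm_score_grad(
                action, contact_target)
        action = action + dt * velocity
    return action


if __name__ == "__main__":
    visual = np.array([0.5, 0.0])   # coarse action suggested by vision
    contact = np.array([0.5, 0.1])  # action consistent with contact
    print("unguided:", sample_action(visual, contact, guide_weight=0.0))
    print("guided:  ", sample_action(visual, contact))
```

In this toy setting the guided sample settles between the two targets, weighted by `guide_weight`, which is only meant to show how a score gradient can refine a coarse action inside the action space; the real CPM is a learned model conditioned on tactile input.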

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Referring-Aware Visuomotor Policy Learning for Closed-Loop Manipulation
cs.RO 2026-04 unverdicted novelty 7.0
ReV is a referring-aware visuomotor policy using coupled diffusion heads for real-time trajectory replanning in robotic manipulation, trained solely via targeted perturbations to expert demonstrations and achieving hi...

Learning Tactile-Aware Quadrupedal Loco-Manipulation Policies
cs.RO 2026-04 unverdicted novelty 6.0
A tactile-aware hierarchical policy for quadrupedal loco-manipulation improves real-world contact-rich task performance by 28.54% over vision-only and visuotactile baselines.

TAMEn: Tactile-Aware Manipulation Engine for Closed-Loop Data Collection in Contact-Rich Tasks
cs.RO 2026-04 unverdicted novelty 6.0
TAMEn supplies a cross-morphology wearable interface and pyramid-structured visuo-tactile data regime that raises bimanual manipulation success rates from 34% to 75% via closed-loop collection.

Learning Tactile-Aware Quadrupedal Loco-Manipulation Policies
cs.RO 2026-04 unverdicted novelty 5.0
A hierarchical tactile-aware policy combines human-demonstration training for contact cue prediction with sim-to-real reinforcement learning to improve quadrupedal loco-manipulation performance by 28.54% over vision b...