The work creates NIABench and an LLM-plus-scoring-model framework that enables robots to deliver proactive assistance during human multi-step activities while avoiding interruptions and reducing human effort.
Vcot-grasp: Grasp foundation models with visual chain-of-thought reasoning for language-driven grasp generation
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026 3verdicts
UNVERDICTED 3representative citing papers
A physical agentic loop with execution-state monitoring improves robustness of language-guided grasping over open-loop execution by converting noisy telemetry into discrete outcome events that trigger retries or user escalation.
AVI-HT adaptively fuses vision and IMU data via attention to cut 3D hand keypoint error by 16.1% (24.2% wrist-aligned) on a new 100K+ sample DexGloveHOI dataset in occluded hand-object scenarios.
citing papers explorer
-
Assistance Without Interruption: A Benchmark and LLM-based Framework for Non-Intrusive Human-Robot Assistance
The work creates NIABench and an LLM-plus-scoring-model framework that enables robots to deliver proactive assistance during human multi-step activities while avoiding interruptions and reducing human effort.
-
A Physical Agentic Loop for Language-Guided Grasping with Execution-State Monitoring
A physical agentic loop with execution-state monitoring improves robustness of language-guided grasping over open-loop execution by converting noisy telemetry into discrete outcome events that trigger retries or user escalation.
-
AVI-HT: Adaptive Vision-IMU Fusion for 3D Hand Tracking
AVI-HT adaptively fuses vision and IMU data via attention to cut 3D hand keypoint error by 16.1% (24.2% wrist-aligned) on a new 100K+ sample DexGloveHOI dataset in occluded hand-object scenarios.