DexTeleop-0: Force-Aware Bimanual Dexterous Teleoperation with Ego-Centric Perception towards Shared Autonomy

Haichao Liu; Hyunsun Park; Yuanjiang Xue; Yuyao Jiang; Ziwei Wang

arxiv: 2606.23431 · v1 · pith:3FM2VZAPnew · submitted 2026-06-22 · 💻 cs.RO

Haichao Liu, Yuyao Jiang, Hyunsun Park, Yuanjiang Xue, Ziwei Wang

Pith reviewed 2026-06-26 08:35 UTC · model grok-4.3

classification 💻 cs.RO

keywords bimanual teleoperation · dexterous manipulation · tactile sensing · force feedback · embodiment gap · Jacobian correction · shared autonomy

The pith

A tactile-driven adaptation loop in DexTeleop-0 translates coarse human tracking into precise force-compliant commands for bimanual dexterous teleoperation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces DexTeleop-0, a bimanual dexterous teleoperation framework that adds a real-time optimization strategy to handle contact-rich tasks where traditional kinematic mapping falls short. It estimates contact points from fingertip force sensors and uses the operational space Jacobian to generate localized corrections that convert human intent into compliant robotic motion. Evaluations in simulation and on hardware show consistent gains over baselines in grasping, disturbance rejection, and complex manipulation. The approach targets the embodiment gap that limits data collection for high-precision work.

Core claim

By estimating contact points from a tactile-enabled fingertip force-sensing profile and dynamically computing corrections via the operational space Jacobian with respect to joint angle updates, the tactile-driven adaptation strategy bridges the embodiment gap and produces precise, force-compliant commands from coarse teleoperation inputs.

What carries the argument

Tactile-driven adaptation strategy that estimates contact points from force profiles and applies operational space Jacobian corrections in a real-time loop.
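
To make the idea of a Jacobian correction concrete, here is a minimal sketch, not the paper's implementation. It assumes a planar two-link finger, an admittance-style rule in which the fingertip displacement is proportional to the force error, and a damped least-squares mapping from that displacement to joint angles. The function names, link lengths, `compliance`, and `damping` values are all hypothetical choices for illustration.

```python
import math

def fingertip_jacobian(q1, q2, l1=0.04, l2=0.03):
    """2x2 positional Jacobian of a planar two-link finger.

    Link lengths l1, l2 (metres) are illustrative, not taken from the paper.
    """
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]

def damped_joint_update(J, dx, damping=1e-3):
    """Return dq = J^T (J J^T + damping^2 I)^-1 dx for a 2x2 Jacobian."""
    (a, b), (c, d) = J
    # A = J J^T + damping^2 I is symmetric positive definite when damping > 0.
    a11 = a * a + b * b + damping ** 2
    a12 = a * c + b * d
    a22 = c * c + d * d + damping ** 2
    det = a11 * a22 - a12 * a12
    # Solve A y = dx with the closed-form 2x2 inverse.
    y0 = (a22 * dx[0] - a12 * dx[1]) / det
    y1 = (-a12 * dx[0] + a11 * dx[1]) / det
    # Map back to joint space with J^T.
    return [a * y0 + c * y1, b * y0 + d * y1]

def force_correction(q, f_measured, f_desired, compliance=1e-3, damping=1e-3):
    """Assumed admittance rule: fingertip displacement = compliance * force error."""
    dx = [compliance * (fd - fm) for fd, fm in zip(f_desired, f_measured)]
    return damped_joint_update(fingertip_jacobian(*q), dx, damping)
```

Calling `force_correction((0.3, 0.6), (0.0, 0.0), (0.0, 1.0))` returns a small joint-angle update that moves the fingertip about 1 mm along the direction of the missing force. The damping term trades a little accuracy for bounded updates near singular poses.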

If this is right

Higher success rates on robust grasping tasks
Better resilience to disturbances during manipulation
Improved efficiency on complex dexterous sequences
Lower barrier to collecting high-quality data for precise bimanual skills

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same correction loop could be tested on single-arm or multi-fingered setups to check transfer
If the Jacobian corrections remain stable at higher speeds, they might support faster shared-autonomy modes
Combining the force profile with visual ego-centric perception could further tighten contact estimates

Load-bearing premise

Real-time contact point estimates from fingertip force data plus Jacobian updates produce accurate corrections that close the embodiment gap without adding instability or excessive delay.
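
One simple way to picture a contact-point estimate is a force-weighted centroid over a fingertip taxel array. The sketch below assumes such an array exists and that each taxel reports a scalar normal force. The abstract does not describe the sensor layout or the estimator, so `estimate_contact_point`, the taxel positions, and the `noise_floor` threshold are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float, float]

def estimate_contact_point(
    taxel_positions: Sequence[Point],
    taxel_forces: Sequence[float],
    noise_floor: float = 0.05,
) -> Optional[Point]:
    """Force-weighted centroid of the taxels reading above noise_floor (newtons).

    Returns None when no taxel exceeds the threshold, i.e. no contact is detected.
    """
    if len(taxel_positions) != len(taxel_forces):
        raise ValueError("positions and forces must have the same length")
    total = 0.0
    acc = [0.0, 0.0, 0.0]
    for pos, force in zip(taxel_positions, taxel_forces):
        if force <= noise_floor:
            continue  # treat sub-threshold readings as sensor noise
        total += force
        for axis in range(3):
            acc[axis] += force * pos[axis]
    if total == 0.0:
        return None
    return (acc[0] / total, acc[1] / total, acc[2] / total)
```

The threshold illustrates why the premise is load-bearing: set too low, noise drags the centroid around; set too high, light contacts go undetected.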

What would settle it

A side-by-side trial on the same hardware where the proposed method shows equal or lower task success rates than the baselines, or where end-to-end latency visibly increases.
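
A side-by-side trial of this kind is only informative if the success rates come with uncertainty estimates. As a hedged sketch, the code below computes a Wilson score interval for each success rate and applies a deliberately conservative rule: the method counts as better only when its lower bound exceeds the baseline's upper bound. The function names are hypothetical and the trial counts in the usage note are made up for illustration; they are not results from the paper.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1.0 + z * z / trials
    centre = (p + z * z / (2.0 * trials)) / denom
    half = z * math.sqrt(p * (1.0 - p) / trials + z * z / (4.0 * trials * trials)) / denom
    return (centre - half, centre + half)

def clearly_better(method, baseline, z=1.96):
    """True only if the intervals do not overlap in the method's favour.

    method and baseline are (successes, trials) pairs.
    """
    low_method, _ = wilson_interval(method[0], method[1], z)
    _, high_baseline = wilson_interval(baseline[0], baseline[1], z)
    return low_method > high_baseline
```

With hypothetical counts, `clearly_better((18, 20), (8, 20))` is true while `clearly_better((18, 20), (15, 20))` is false, which shows how a headline gap of 90% versus 75% can still be inconclusive at 20 trials per condition.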

Original abstract

Fine-grained, bimanual dexterous manipulation remains a foundational challenge in robotics. Traditional teleoperation systems often fail in contact-rich tasks because embodiment gaps hinder accurate kinematic mapping, while tactile and force feedback remain absent. Consequently, data collection efficiency for high-precision tasks remains prohibitively low. To address these limitations, we propose a tactile-driven adaptation strategy designed to enable fine-grained manipulation on top of teleoperation pipelines. Instantiated within our bimanual dexterous framework, DexTeleop-0, this strategy introduces a real-time optimization loop that bridges the embodiment gap by translating coarse human tracking intents into precise, force-compliant robotic commands with tactile sensing. By estimating accurate contact points and leveraging a tactile-enabled fingertip force-sensing profile, the system dynamically computes localized corrections using the operational space Jacobian with respect to joint angle updates. We rigorously evaluate this tactile-driven adaptation strategy across both simulated environments and real-world hardware. Compared with representative baselines, the proposed method consistently achieves higher task success rates and improved execution efficiency in robust grasping, disturbance-resilient manipulation, and complex dexterous tasks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

DexTeleop-0 adds a tactile correction loop to bimanual teleop with hardware tests, but the abstract gives no numbers so the actual gain is hard to judge.

The main thing to know is that the paper describes a bimanual dexterous teleoperation system called DexTeleop-0 that adds a real-time tactile adaptation loop. It estimates contact points from fingertip force profiles and uses operational space Jacobian corrections to turn coarse human inputs into force-compliant robot commands, with the goal of closing the embodiment gap for contact-rich tasks.

The paper's strength is that it combines ego-centric perception, force sensing, and a bimanual setup in one pipeline, then tests it on fitting tasks: robust grasping, disturbance-resilient manipulation, and complex dexterous work. The abstract reports both simulation and real-hardware experiments that directly measure task success and efficiency, so the central claim is at least set up for falsification rather than left as an untested idea.

The soft spots are mostly about missing detail. The abstract states that the method outperformed representative baselines but supplies no success rates, error bars, dataset sizes, or baseline descriptions, which makes it impossible to tell how large or reliable the improvement is. The load-bearing assumption that contact-point estimation plus Jacobian updates will stay stable and low-latency under real sensor noise is plausible on paper but not obviously probed for failure modes in the summary. No circular reasoning shows up in the argument structure.

This paper is for robotics researchers who build teleoperation systems for imitation learning or shared autonomy in dexterous manipulation. Someone in that area would get value from seeing a concrete implementation that tries to add tactile feedback to an existing pipeline. It deserves a serious referee because it has hardware validation on a recognized practical bottleneck even if the quantitative evidence needs more context to evaluate.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes DexTeleop-0, a bimanual dexterous teleoperation framework incorporating a tactile-driven adaptation strategy. This strategy estimates contact points from fingertip force-sensing profiles and applies operational-space Jacobian corrections to translate coarse human tracking into precise, force-compliant commands, aiming to bridge the embodiment gap. The work claims rigorous evaluation in both simulation and real hardware, with consistent outperformance over representative baselines in task success rates and execution efficiency for robust grasping, disturbance-resilient manipulation, and complex dexterous tasks.

Significance. If the empirical superiority in success rates and efficiency holds under proper controls and reporting, the tactile adaptation mechanism could meaningfully advance force-aware teleoperation for contact-rich dexterous tasks, supporting improved data collection for shared autonomy pipelines.

major comments (1)

[Abstract] The central claim states that the method 'consistently achieves higher task success rates and improved execution efficiency' compared with baselines across three task categories, yet the abstract (and the manuscript as presented) supplies no quantitative results, error bars, task definitions, baseline descriptions, or dataset details. This absence leaves the empirical claim, which is load-bearing for the paper's contribution, unverifiable.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback and recommendation for major revision. We address the concern about the abstract below.

Referee: [Abstract] The central claim states that the method 'consistently achieves higher task success rates and improved execution efficiency' compared with baselines across three task categories, yet the abstract (and the manuscript as presented) supplies no quantitative results, error bars, task definitions, baseline descriptions, or dataset details. This absence leaves the empirical claim, which is load-bearing for the paper's contribution, unverifiable.

Authors: We agree that the abstract should include key quantitative results to make the central empirical claim verifiable on its own. The full manuscript reports these details (success rates with error bars, task definitions, baselines, and datasets) in the experiments section. We will revise the abstract to incorporate the main quantitative findings while preserving conciseness.

revision: yes

Circularity Check

0 steps flagged

No significant circularity

The manuscript describes an empirical teleoperation framework evaluated via task success rates and efficiency metrics in simulation and hardware. No equations, derivations, fitted parameters presented as predictions, or first-principles results appear in the abstract or summary. Claims rest on direct experimental falsification rather than any self-referential reduction, self-citation chain, or ansatz smuggling. The load-bearing mechanism (tactile contact estimation plus Jacobian corrections) is tested against baselines without internal definitional equivalence.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no equations, no fitted constants, and no explicit modeling assumptions, so the ledger cannot be populated beyond noting that the central claim rests on unstated assumptions about sensor accuracy and real-time Jacobian invertibility.
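
The invertibility concern can be stated precisely. A common way to sidestep an ill-conditioned Jacobian is a damped least-squares inverse. Whether the paper uses it is not stated in the abstract, so the following is a standard-technique sketch. Here $\Delta x$ is an assumed fingertip-space correction, $\Delta q$ the joint update, $\lambda > 0$ a damping constant, and $\sigma_i, u_i, v_i$ the singular values and vectors of $J$.

```latex
\Delta q = J^{\top}\left(J J^{\top} + \lambda^{2} I\right)^{-1} \Delta x,
\qquad
J = \sum_i \sigma_i\, u_i v_i^{\top}
\;\Longrightarrow\;
\Delta q = \sum_i \frac{\sigma_i}{\sigma_i^{2} + \lambda^{2}}\, v_i u_i^{\top} \Delta x,
\qquad
\frac{\sigma_i}{\sigma_i^{2} + \lambda^{2}} \le \frac{1}{2\lambda}.
```

The bound shows that the joint update stays finite even as a singular value goes to zero, at the cost of a tracking error in the weakly actuated direction. An undamped inverse would instead scale that direction by $1/\sigma_i$, which is unbounded near a singularity.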

pith-pipeline@v0.9.1-grok · 5741 in / 1133 out tokens · 19730 ms · 2026-06-26T08:35:33.955781+00:00 · methodology

