Recognition: 2 theorem links
TouchAnything: A Dataset and Framework for Bimanual Tactile Estimation from Egocentric Video
Pith reviewed 2026-05-14 18:58 UTC · model grok-4.3
The pith
Continuous tactile pressure maps can be predicted from egocentric video of bimanual interactions, with accuracy that improves when optional wrist-mounted views are incorporated.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes that continuous pressure maps for bimanual hand-object interactions can be inferred from egocentric video input, and that a multi-view vision-to-touch model trained on the EgoTouch dataset achieves up to 5.0 percent relative improvement in Contact IoU and 6.1 percent relative improvement in Volumetric IoU when wrist-mounted views are included alongside the primary egocentric view.
What carries the argument
TouchAnything, a baseline multi-view vision-to-touch prediction framework that treats the egocentric view as the main input and incorporates available wrist views to refine tactile estimates.
If this is right
- Large egocentric video collections can receive inferred tactile labels to support physically grounded pretraining of interaction models.
- Embodied agents can acquire contact-aware representations without deploying tactile hardware on every robot or recording rig.
- Bimanual manipulation policies can be trained with more realistic estimates of force and pressure distribution.
- Dataset scaling for tactile research becomes feasible across diverse real-world environments using only cameras.
Where Pith is reading between the lines
- Pre-existing egocentric video archives could be retroactively annotated with tactile predictions to bootstrap new training runs.
- Wearable systems using only head and wrist cameras might deliver approximate touch feedback for augmented or virtual reality interfaces.
- The prediction approach could be extended to estimate additional contact properties such as friction or slip once paired with suitable auxiliary signals.
Load-bearing premise
The wearable tactile sensors supply accurate, dense, and time-synchronized ground-truth pressure maps that can serve as reliable supervision for training vision-based prediction models.
What would settle it
Apply the trained model to a new set of manipulation episodes recorded with the same sensor suite but featuring unseen objects or force profiles, then check whether the predicted contact regions and pressure values diverge substantially from the actual sensor readings in location, area, or magnitude.
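The proposed check can be sketched as a small divergence measurement between a predicted pressure map and the sensor reading, covering the three axes named above: location, area, and magnitude. The contact threshold and the specific divergence measures are hypothetical illustrations, not the paper's protocol:

```python
import numpy as np

def contact_divergence(pred, gt, thresh=0.1):
    """Compare a predicted pressure map against the sensor ground truth.

    Returns divergence in contact location (centroid distance, in cells),
    contact area (relative difference in active cells), and magnitude
    (mean absolute pressure error over the union of contact regions).
    `thresh` is a hypothetical contact threshold, not the paper's value.
    """
    pred_c, gt_c = pred > thresh, gt > thresh
    # Area divergence: relative difference in number of in-contact cells
    area_div = abs(pred_c.sum() - gt_c.sum()) / max(gt_c.sum(), 1)

    # Location divergence: distance between the two contact centroids
    def centroid(mask):
        idx = np.argwhere(mask)
        return idx.mean(axis=0) if len(idx) else np.zeros(mask.ndim)

    loc_div = np.linalg.norm(centroid(pred_c) - centroid(gt_c))
    # Magnitude divergence: mean |error| where either map reports contact
    union = pred_c | gt_c
    mag_div = np.abs(pred - gt)[union].mean() if union.any() else 0.0
    return loc_div, area_div, mag_div
```

Running this over unseen episodes and comparing the three numbers against their in-distribution values would make "diverge substantially" concrete.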
read the original abstract
Egocentric human video data, which captures rich human-environment interactions and can be collected at scale, has become a key driver of embodied intelligence research. However, existing egocentric datasets typically lack tactile sensing, a critical modality that provides direct cues about contact, force, and pressure in human-object interaction. Without such signals, models struggle to learn physically grounded representations of real-world interaction dynamics. While tactile sensors provide these cues, deploying high-quality tactile hardware at scale remains expensive and cumbersome. This raises a central question: can tactile feedback be inferred directly from visual observations, enabling scalable tactile supervision for egocentric video data and supporting physically grounded embodied learning? To enable research in this direction, we introduce EgoTouch, a large-scale multi-view egocentric dataset with dense tactile supervision for bimanual hand-object interaction. EgoTouch comprises 208 manipulation tasks spanning 1,891 episodes in diverse indoor and outdoor environments, with synchronized multi-view RGB (head-mounted egocentric and dual wrist-mounted cameras), bimanual 3D hand pose, and continuous pressure maps from wearable tactile sensors. Building on EgoTouch, we introduce TouchAnything, a baseline multi-view vision-to-touch prediction framework that uses the egocentric view as the primary input and flexibly leverages available wrist-mounted views at inference time. Experiments show that incorporating wrist-mounted views generally improves tactile prediction over egocentric-only input, achieving up to 5.0% relative improvement in Contact IoU and 6.1% relative improvement in Volumetric IoU. We will publicly release the dataset, code, and benchmark.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the EgoTouch dataset comprising 208 bimanual manipulation tasks and 1,891 episodes with synchronized head-mounted egocentric RGB, dual wrist-mounted RGB, 3D hand poses, and dense pressure maps from wearable tactile sensors. It also presents the TouchAnything multi-view vision-to-touch prediction framework that takes egocentric video as primary input and optionally incorporates wrist views at inference time, reporting relative gains of up to 5.0% in Contact IoU and 6.1% in Volumetric IoU when wrist views are added.
Significance. If the tactile ground-truth signals prove reliable, the dataset and benchmark would enable scalable vision-based tactile supervision for egocentric video, supporting physically grounded embodied learning without requiring tactile hardware at test time. The public release of data, code, and benchmark is a clear strength for reproducibility.
major comments (3)
- [Dataset] Dataset section: the manuscript provides no description of sensor calibration against known forces, drift correction, contact-area validation, or temporal synchronization checks between RGB frames and pressure readings across the 1,891 episodes. This is load-bearing because the reported IoU improvements are measured against these pressure maps as supervision.
- [Experiments] Experiments section: the headline relative improvements (5.0% Contact IoU, 6.1% Volumetric IoU) are presented without error bars, statistical significance tests, explicit train/validation/test splits, or comparison against strong single-view and multi-view baselines, preventing assessment of whether the wrist-view gains are robust.
- [Method] Model description: the TouchAnything architecture does not specify the exact mechanism for flexibly incorporating wrist views only at inference (e.g., whether the network is trained with all views or uses late fusion), which is central to the claim of flexible multi-view prediction.
minor comments (2)
- [Abstract] Abstract: the phrase 'up to' for the maximum improvements does not indicate the specific tasks or conditions under which these values are achieved.
- [Evaluation Metrics] Notation: the definitions of Contact IoU and Volumetric IoU are not restated in the main text, forcing readers to consult supplementary material for metric details.
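For readers without the supplementary material, one common reading of the two metrics is: Contact IoU as intersection-over-union of thresholded contact masks, and Volumetric IoU as a soft IoU over pressure magnitudes. The sketch below follows that reading; the threshold value and the exact formulas are assumptions, not the paper's definitions:

```python
import numpy as np

def contact_iou(pred, gt, thresh=0.1):
    # Binarize at a (hypothetical) contact threshold, then standard IoU
    p, g = pred > thresh, gt > thresh
    union = (p | g).sum()
    return (p & g).sum() / union if union else 1.0

def volumetric_iou(pred, gt):
    # Soft IoU over pressure values: sum of element-wise minima over maxima
    denom = np.maximum(pred, gt).sum()
    return np.minimum(pred, gt).sum() / denom if denom else 1.0
```

Under this reading, Contact IoU ignores pressure magnitude once contact is detected, while Volumetric IoU penalizes magnitude errors even when the contact region is perfect.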
Simulated Author's Rebuttal
We sincerely thank the referee for the constructive and detailed feedback. We address each major comment below and will revise the manuscript to incorporate the suggested improvements for greater clarity and rigor.
read point-by-point responses
-
Referee: [Dataset] Dataset section: the manuscript provides no description of sensor calibration against known forces, drift correction, contact-area validation, or temporal synchronization checks between RGB frames and pressure readings across the 1,891 episodes. This is load-bearing because the reported IoU improvements are measured against these pressure maps as supervision.
Authors: We agree that a detailed account of sensor calibration and synchronization is necessary to substantiate the reliability of the tactile supervision. In the revised manuscript, we will add a new subsection in the Dataset section describing: (1) calibration of the wearable tactile sensors against known forces using a reference force gauge, (2) the drift correction procedure applied to pressure readings, (3) contact-area validation performed by cross-referencing pressure maps with visual contact annotations on a held-out subset of episodes, and (4) the temporal synchronization protocol employing hardware triggers and post-hoc timestamp alignment between the RGB streams and pressure data. These additions will directly address the concern regarding the quality of the ground-truth signals. revision: yes
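The promised post-hoc timestamp alignment could look like the following sketch, which matches each RGB frame to its nearest pressure sample and flags frames whose skew exceeds a tolerance. Both the shared-clock assumption and the `max_skew` tolerance are hypothetical, not taken from the paper:

```python
import numpy as np

def align_pressure_to_frames(frame_ts, pressure_ts, max_skew=0.02):
    """Match each RGB frame to the nearest pressure sample in time.

    Hypothetical alignment: both timestamp arrays are assumed to be in
    seconds on a shared clock (e.g. after a hardware-trigger offset fix).
    Frames with no sample within `max_skew` seconds are marked -1.
    """
    pressure_ts = np.asarray(pressure_ts)
    idx = np.searchsorted(pressure_ts, frame_ts)  # insertion points
    matches = []
    for t, i in zip(frame_ts, idx):
        # Candidates: the samples just before and at the insertion point
        cand = [j for j in (i - 1, i) if 0 <= j < len(pressure_ts)]
        best = min(cand, key=lambda j: abs(pressure_ts[j] - t))
        matches.append(int(best) if abs(pressure_ts[best] - t) <= max_skew else -1)
    return matches
```

The fraction of `-1` entries per episode would be a simple synchronization-quality statistic to report alongside the dataset.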
-
Referee: [Experiments] Experiments section: the headline relative improvements (5.0% Contact IoU, 6.1% Volumetric IoU) are presented without error bars, statistical significance tests, explicit train/validation/test splits, or comparison against strong single-view and multi-view baselines, preventing assessment of whether the wrist-view gains are robust.
Authors: We concur that additional statistical rigor and baseline comparisons are required to properly evaluate the robustness of the wrist-view gains. In the revision we will: report error bars as standard deviations computed over five independent training runs with different random seeds; include paired t-test results with p-values to establish statistical significance of the improvements; explicitly document the train/validation/test episode splits (70/15/15 ratio, stratified by task to prevent leakage); and expand the baselines to include a strong single-view egocentric model, a fully multi-view model trained and tested with all views, and an additional late-fusion variant. These changes will allow readers to assess the reliability of the reported relative gains. revision: yes
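The promised seed-level statistics can be illustrated with a minimal paired t-statistic over matched runs. The example scores are invented placeholders, and the p-value lookup is omitted to keep the sketch dependency-free:

```python
import math
import statistics

def paired_t(scores_a, scores_b):
    """Paired t-statistic across matched runs (e.g. five seeds).

    scores_a / scores_b: per-seed metric values for two configurations
    (e.g. with vs. without wrist views). Returns (mean difference,
    sample std of differences, t). The p-value would come from a
    t-distribution with n - 1 degrees of freedom, omitted here.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample std (ddof = 1)
    t = mean_d / (sd_d / math.sqrt(n))
    return mean_d, sd_d, t
```

Reporting `mean_d ± sd_d` with the t-statistic per metric would let readers judge whether the 5.0% and 6.1% relative gains clear the seed-to-seed noise floor.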
-
Referee: [Method] Model description: the TouchAnything architecture does not specify the exact mechanism for flexibly incorporating wrist views only at inference (e.g., whether the network is trained with all views or uses late fusion), which is central to the claim of flexible multi-view prediction.
Authors: We thank the referee for highlighting this ambiguity. The TouchAnything model is trained end-to-end with all available views (egocentric plus wrist) using a cross-attention fusion module; at inference, wrist views are optionally omitted by zero-masking the corresponding input tokens to the fusion layer, which enables flexible single- or multi-view operation without retraining. We will revise the Method section to explicitly describe this training/inference procedure, add pseudocode for the masking mechanism, and include a diagram clarifying the view-conditional forward pass. revision: yes
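The masking mechanism described in this response can be sketched as a toy single-head fusion step in which unavailable wrist views contribute only zeroed tokens. The shapes, the absence of learned projections, and the concatenation scheme are simplifications for illustration, not the paper's architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_views(ego, wrists, wrist_available):
    """Toy single-head cross-attention with optional wrist views.

    ego: (Tq, d) query tokens from the egocentric stream.
    wrists: (V, Tk, d) tokens from V wrist cameras.
    wrist_available: length-V booleans; unavailable views are zero-masked
    before fusion, mirroring the rebuttal's view-conditional inference.
    """
    V, Tk, d = wrists.shape
    mask = np.asarray(wrist_available, dtype=float)[:, None, None]
    kv = (wrists * mask).reshape(V * Tk, d)  # zero out missing views' tokens
    kv = np.concatenate([ego, kv], axis=0)   # ego tokens are always present
    attn = softmax(ego @ kv.T / np.sqrt(d))  # (Tq, Tq + V*Tk)
    return attn @ kv
```

Note that zero-masked tokens still receive nonzero attention weight, which is one reason training with randomly dropped views (rather than masking only at inference) could matter; the revised Method section would presumably clarify this.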
Circularity Check
No circularity; purely empirical dataset and baseline
full rationale
The paper presents a new multi-view egocentric dataset (EgoTouch) with synchronized RGB, 3D hand pose, and wearable pressure maps across 1,891 episodes, plus a baseline multi-view vision-to-touch prediction model (TouchAnything). All reported results are empirical performance numbers (Contact IoU, Volumetric IoU) measured on held-out episodes from the collected data; no equations, fitted parameters, or first-principles derivations are claimed, and no self-citation chain is used to justify uniqueness or force a result. The contribution is therefore self-contained against external benchmarks and does not reduce any prediction to its own inputs by construction.
Axiom & Free-Parameter Ledger
free parameters (1)
- neural network hyperparameters
axioms (1)
- domain assumption Wearable tactile sensors provide accurate and dense ground-truth pressure maps synchronized with video.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · relevance: unclear. Matched passage: "TouchAnything, a baseline multi-view vision-to-touch prediction framework that uses the egocentric view as the primary input and flexibly leverages available wrist-mounted views at inference time... L = λ_MSE·L_MSE + λ_L1·L_L1 + λ_TV·L_TV"
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · relevance: unclear. Matched passage: "EgoTouch comprises 208 manipulation tasks spanning 1,891 episodes... continuous pressure maps from wearable tactile sensors"
Reference graph
Works this paper leans on
-
[1]
Amrit Aryal, Santosh Giri, Sanjeeb Prasad Panday, Suman Sharma, Babu R. Dawadi, and Sushant Chalise. Efficient 3d scene reconstruction from multi-view RGB images using optimized gaussian splatting. IEEE Access, 14:1269–1286, 2026. doi: 10.1109/ACCESS.2025.3648171. URL https://doi.org/10.1109/ACCESS.2025.3648171
-
[3]
ContactDB: Analyzing and Predicting Grasp Contact via Thermal Imaging
Samarth Brahmbhatt, Cusuh Ham, Charles C. Kemp, and James Hays. Contactdb: Analyzing and predicting grasp contact via thermal imaging, 2019. URL https://arxiv.org/abs/1904.06830
2019
-
[4]
Dexycb: A benchmark for capturing hand grasping of objects
Yu-Wei Chao, Wei Yang, Yu Xiang, Pavlo Molchanov, Ankur Handa, Jonathan Tremblay, Yashraj S. Narang, Karl Van Wyk, Umar Iqbal, Stan Birchfield, Jan Kautz, and Dieter Fox. Dexycb: A benchmark for capturing hand grasping of objects. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021
2021
-
[5]
Scaling egocentric vision: The epic-kitchens dataset
Dima Damen, Hazel Doughty, Giovanni Maria Farinella, et al. Scaling egocentric vision: The epic-kitchens dataset. In European Conference on Computer Vision (ECCV), 2018
2018
-
[6]
Actionsense: A multimodal dataset and recording framework for human activities using wearable sensors in a kitchen environment
Joseph DelPreto and Daniela Rus. Actionsense: A multimodal dataset and recording framework for human activities using wearable sensors in a kitchen environment. In Advances in Neural Information Processing Systems (NeurIPS), 2022
2022
-
[7]
Arctic: A dataset for dexterous bimanual hand-object manipulation
Zicong Fan, Omid Taheri, Dimitrios Tzionas, Muhammed Kocabas, Manuel Kaufmann, Michael J. Black, and Otmar Hilliges. Arctic: A dataset for dexterous bimanual hand-object manipulation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
2023
-
[8]
Shenyuan Gao, William Liang, Kaiyuan Zheng, Ayaan Malik, Seonghyeon Ye, Sihyun Yu, Wei-Cheng Tseng, Yuzhu Dong, Kaichun Mo, Chen-Hsuan Lin, Qianli Ma, Seungjun Nah, Loic Magne, Jiannan Xiang, Yuqi Xie, Ruijie Zheng, Dantong Niu, You Liang Tan, K. R. Zentner, George Kurian, Suneel Indupuru, Pooya Jannaty, Jinwei Gu, Jun Zhang, Jitendra Malik, Pieter Abbeel...
-
[9]
Egopressure: A dataset for hand pressure and pose estimation in egocentric views
Patrick Grady et al. Egopressure: A dataset for hand pressure and pose estimation in egocentric views. arXiv preprint, 2024
2024
-
[10]
Ego4d: Around the world in 3,000 hours of egocentric video
Kristen Grauman, Andrew Westbury, et al. Ego4d: Around the world in 3,000 hours of egocentric video. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
2022
-
[11]
Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zachary Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, María Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md M...
-
[12]
Ego-exo4d: Understanding skilled human activity from first-and third-person perspectives
Kristen Grauman et al. Ego-exo4d: Understanding skilled human activity from first-and third-person perspectives. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
2024
-
[13]
Keypoint transformer: Solving joint identification in challenging hands and object interactions for accurate 3d pose estimation
Shreyas Hampali, Sayan Deb Sarkar, Mahdi Rad, and Vincent Lepetit. Keypoint transformer: Solving joint identification in challenging hands and object interactions for accurate 3d pose estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 11090–11100, June 2022
2022
-
[14]
EgoDex: Learning Dexterous Manipulation from Large-Scale Egocentric Video
Ryan Hoque, Peide Huang, David J. Yoon, Mouli Sivapurapu, and Jian Zhang. Egodex: Learning dexterous manipulation from large-scale egocentric video, 2026. URL https://arxiv.org/abs/2505.11709
-
[15]
Hand Pose Estimation via Latent 2.5D Heatmap Regression
Umar Iqbal, Pavlo Molchanov, Thomas Breuel, Juergen Gall, and Jan Kautz. Hand pose estimation via latent 2.5D heatmap regression. arXiv preprint arXiv:1804.09534, 2018
2018
-
[16]
Digit: A novel design for a low-cost compact high-resolution tactile sensor with application to in-hand manipulation
Mike Lambeta, Po-Wei Chou, Stephen Tian, et al. Digit: A novel design for a low-cost compact high-resolution tactile sensor with application to in-hand manipulation. IEEE Robotics and Automation Letters, 2020
2020
-
[17]
Ego-1k: A large-scale multiview video dataset for egocentric vision
Jae Yong Lee, Daniel Scharstein, Akash Bapat, Hao Hu, Andrew Fu, Haoru Zhao, Paul Sammut, Xiang Li, Stephen Jeapes, Anik Gupta, Lior David, Saketh Madhuvarasu, Jay Girish Joshi, and Jason Wither. Ego-1k: A large-scale multiview video dataset for egocentric vision. CoRR, abs/2603.13741, 2026. doi: 10.48550/ARXIV.2603.13741. URL https://doi.org/10.48550/arXi...
-
[18]
Connecting touch and vision via cross-modal prediction
Yunzhu Li, Jun-Yan Zhu, Antonio Torralba, et al. Connecting touch and vision via cross-modal prediction. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019
2019
-
[19]
Yijiong Lin, Mauro Comi, Alex Church, Dandan Zhang, and Nathan F. Lepora. Attention for robot touch: Tactile saliency prediction for robust sim-to-real tactile control. In IROS, pages 10806–10812, 2023. doi: 10.1109/IROS55552.2023.10341888. URL https://doi.org/10.1109/IROS55552.2023.10341888
-
[20]
Hoi4d: A 4d egocentric dataset for category-level human-object interaction
Yunze Liu, Yun Liu, Che Jiang, et al. Hoi4d: A 4d egocentric dataset for category-level human-object interaction. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
2022
-
[21]
Being-h0.5: Scaling human-centric robot learning for cross-embodiment generalization
Hao Luo, Ye Wang, Wanpeng Zhang, Sipeng Zheng, Ziheng Xi, Chaoyi Xu, Haiweng Xu, Haoqi Yuan, Chi Zhang, Yiqing Wang, Yicheng Feng, and Zongqing Lu. Being-h0.5: Scaling human-centric robot learning for cross-embodiment generalization. CoRR, abs/2601.12993, 2026. doi: 10.48550/ARXIV.2601.12993. URL https://doi.org/10.48550/arXiv.2601.12993
-
[22]
Dinov2: Learning robust visual features without supervision
Maxime Oquab, Timothée Darcet, Théo Moutakanni, et al. Dinov2: Learning robust visual features without supervision. Transactions on Machine Learning Research, 2023
2023
-
[23]
Toby Perrett, Ahmad Darkhalil, Saptarshi Sinha, Omar Emara, Sam Pollard, Kranti Kumar Parida, Kaiting Liu, Prajwal Gatti, Siddhant Bansal, Kevin Flanagan, Jacob Chalk, Zhifan Zhu, Rhodri Guerrier, Fahd Abdelazim, Bin Zhu, Davide Moltisanti, Michael Wray, Hazel Doughty, and Dima Damen. HD-EPIC: A highly-detailed egocentric video dataset. InIEEE/CVF Confere...
-
[24]
Egocentric hand activity video dataset and bidirectional motion-priors for hand action recognition
Jiyoung Seo, Dong In Lee, Pilhyeon Lee, Jiwoo Lee, Youn-Hee Gil, Karthik Ramani, and Sangpil Kim. Egocentric hand activity video dataset and bidirectional motion-priors for hand action recognition. IEEE Access, 14:8128–8148. doi: 10.1109/ACCESS.2026.3652803. URL https://doi.org/10.1109/ACCESS.2026.3652803
-
[26]
Opentouch: Bringing full-hand touch to real-world interaction, 2025
Yuxin Ray Song, Jinzhou Li, Rao Fu, Devin Murphy, Kaichen Zhou, Rishi Shiv, Yaqi Li, Haoyu Xiong, Crystal Elaine Owens, Yilun Du, Yiyue Luo, Xianyi Cheng, Antonio Torralba, Wojciech Matusik, and Paul Pu Liang. Opentouch: Bringing full-hand touch to real-world interaction, 2025. URL https://arxiv.org/abs/2512.16842
-
[27]
Haisheng Su, Feixiang Song, Cong Ma, Wei Wu, and Junchi Yan. Robosense: Large-scale dataset and benchmark for egocentric robot perception and navigation in crowded and unstructured environments. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025, Nashville, TN, USA, June 11-15, 2025, pages 27446–27455. Computer Vision Foundation ...
-
[28]
A survey of robot manipulation in contact
Markku Suomalainen, Yiannis Karayiannidis, and Ville Kyrki. A survey of robot manipulation in contact. Robotics Auton. Syst., 156:104224, 2022. doi: 10.1016/J.ROBOT.2022.104224. URL https://doi.org/10.1016/j.robot.2022.104224
-
[29]
Grab: A dataset of whole-body human grasping of objects
Omid Taheri, Nima Ghorbani, Michael J. Black, and Dimitrios Tzionas. Grab: A dataset of whole-body human grasping of objects. In European Conference on Computer Vision (ECCV), 2020
2020
-
[30]
Attention is all you need
Ashish Vaswani, Noam Shazeer, Niki Parmar, et al. Attention is all you need. In Advances in Neural Information Processing Systems (NeurIPS), 2017
2017
-
[31]
Akihiko Yamaguchi and Christopher G. Atkeson. Recent progress in tactile sensing and sensors for robotic manipulation: can we turn tactile sensing into vision? Adv. Robotics, 33(14):661–673, 2019. doi: 10.1080/01691864.2019.1632222. URL https://doi.org/10.1080/01691864.2019.1632222
-
[32]
Touch and go: Learning from human-collected vision and touch
Fengyu Yang, Chenyang Ma, Jiacheng Zhang, Jing Zhu, Wenzhen Yuan, and Andrew Owens. Touch and go: Learning from human-collected vision and touch. In Sanmi Koyejo, S. Mohamed, A. Agarwal, Danielle Belgrave, K. Cho, and A. Oh, editors, Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, Ne...
2022
-
[33]
Linhan Yang, Bidan Huang, Qingbiao Li, Ya-Yen Tsai, Wang Wei Lee, Chaoyang Song, and Jia Pan. Tacgnn: Learning tactile-based in-hand manipulation with a blind robot using hierarchical graph neural network. IEEE Robotics Autom. Lett., 8(6):3605–3612, 2023. doi: 10.1109/LRA.2023.3264759. URL https://doi.org/10.1109/LRA.2023.3264759
-
[34]
Oakink: A large-scale knowledge repository for understanding hand-object interaction
Lixin Yang, Kailin Li, Xinyu Zhan, et al. Oakink: A large-scale knowledge repository for understanding hand-object interaction. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
2022
-
[35]
Pressurevision: Estimating hand pressure from a single rgb image
Patrick Grady, Christian Haase-Schütz, Marcel Leonardi, et al. Pressurevision: Estimating hand pressure from a single RGB image. In European Conference on Computer Vision (ECCV), 2022
2022
-
[36]
Ruihan Yang, Qinxi Yu, Yecheng Wu, Rui Yan, Borui Li, An-Chieh Cheng, Xueyan Zou, Yunhao Fang, Xuxin Cheng, Ri-Zhao Qiu, Hongxu Yin, Sifei Liu, Song Han, Yao Lu, and Xiaolong Wang. Egovla: Learning vision-language-action models from egocentric human videos. CoRR, abs/2507.12440, 2025. doi: 10.48550/ARXIV.2507.12440. URL https://doi.org/10.48550/arXiv.2507.12440
-
[37]
Taegyoon Yoon, Yegyu Han, Seojin Ji, Jaewoo Park, Sojeong Kim, Taein Kwon, and Hyung-Sin Kim. Egoxtreme: A dataset for robust object pose estimation in egocentric views under extreme conditions. CoRR, abs/2603.25135. doi: 10.48550/ARXIV.2603.25135. URL https://doi.org/10.48550/arXiv.2603.25135
-
[39]
Wenzhen Yuan, Siyuan Dong, and Edward H. Adelson. Gelsight: High-resolution robot tactile sensors for estimating geometry and force. Sensors, 17(12), 2017
2017
-
[40]
Gu Zhang, Qicheng Xu, Haozhe Zhang, Jianhan Ma, Long He, Yiming Bao, Zeyu Ping, Zhecheng Yuan, Chenhao Lu, Chengbo Yuan, Tianhai Liang, Xiaoyu Tian, Maanping Shao, Feihong Zhang, Mingyu Ding, Yang Gao, Hao Zhao, Hang Zhao, and Huazhe Xu. Unidex: A robot foundation suite for universal dexterous hand control from egocentric human videos. CoRR, abs/2603.22264...
-
[41]
EgoScale: Scaling Dexterous Manipulation with Diverse Egocentric Human Data
Ruijie Zheng, Dantong Niu, Yuqi Xie, Jing Wang, Mengda Xu, Yunfan Jiang, Fernando Castañeda, Fengyuan Hu, You Liang Tan, Letian Fu, Trevor Darrell, Furong Huang, Yuke Zhu, Danfei Xu, and Linxi Fan. Egoscale: Scaling dexterous manipulation with diverse egocentric human data, 2026. URL https://arxiv.org/abs/2602.16710
-
[42]
Touching a nerf: Leveraging neural radiance fields for tactile sensory data generation
Shaohong Zhong et al. Touching a nerf: Leveraging neural radiance fields for tactile sensory data generation. arXiv preprint arXiv:2304.12828, 2023
-
[43]
Christian Zimmermann, Duygu Ceylan, Jimei Yang, Bryan C. Russell, Max J. Argus, and Thomas Brox. Freihand: A dataset for markerless capture of hand pose and shape from single RGB images. In 2019 IEEE/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea (South), October 27 - November 2, 2019, pages 813–822. IEEE, 2019. doi: 10.1109/ICCV.201...
discussion (0)