Recognition: unknown
A Model-based Visual Contact Localization and Force Sensing System for Compliant Robotic Grippers
Pith reviewed 2026-05-09 19:46 UTC · model grok-4.3
The pith
A model-based visual system estimates grasp forces on soft robotic grippers by inverting finite element models from camera observations of deformation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central discovery is a visual contact localization and force sensing system that extracts structural key points from wrist camera RGB-D images of deforming soft grippers. These points define an inverse finite element analysis simulation whose solution yields the contact forces. An iterative deep learning pipeline updates the contact location dynamically, allowing the system to achieve low force estimation errors while generalizing to new objects.
What carries the argument
Inverse finite element analysis simulation driven by structural key points extracted from RGB-D images, integrated with iterative contact localization via 3D reconstruction.
Load-bearing premise
The finite element model must accurately capture the gripper's deformation mechanics and material properties so that the inverse simulation correctly recovers the forces from the observed shapes.
What would settle it
An experiment comparing the estimated forces to those measured by a reference force sensor while grasping previously unseen objects under varying lighting or occlusion conditions would validate or disprove the reported accuracy.
Figures
read the original abstract
Grasp force estimation can help prevent robots from damaging delicate objects during manipulation and improve learning-based robotic control. Integrating force sensing into deformable grippers negotiates trade-offs in cost, complexity, mechanical robustness, and performance. With the growing integration of RGB-D wrist cameras into robotic systems for control purposes, camera-based techniques are a promising solution for indirect visual force estimation. Current approaches mostly utilize end-to-end deep learning, which can be brittle when generalizing to new scenarios, while existing model-based approaches are unsuited to grasping and modern grasper geometries. To address these challenges, we developed a model-based visual force sensing approach integrating an iterative contact localization with generalization to unseen objects. The system extracts structural key points from wrist camera RGB-D images of deforming fin-ray-shaped soft grippers, and uses these key points to define parameters of an inverse finite element analysis simulation in Simulation Open Framework Architecture. The iterative contact localization sub-system utilizes a deep learning-based online 3D reconstruction and pose estimation pipeline to dynamically update contact location, and is robust to visual occlusion and unseen objects. Our system demonstrated an average root mean square error of 0.23 N and normalized root mean square deviation of 2.11% during the load phase, and 0.48 N and 4.34% over the entire grasping process when interacting with different objects under various conditions, showcasing its potential for real-time model-based indirect force sensing of soft grippers.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a model-based visual contact localization and force sensing system for compliant fin-ray grippers. RGB-D wrist-camera images are processed by a deep-learning pipeline for online 3D reconstruction, pose estimation, and keypoint extraction; these keypoints parameterize an inverse finite-element simulation in SOFA whose output is the estimated contact force. The approach is claimed to generalize to unseen objects and to achieve average RMSE of 0.23 N (load phase) / 0.48 N (full grasp) together with NRMSD values of 2.11 % and 4.34 % across multiple objects and conditions.
Significance. If the forward FEA model is shown to reproduce measured gripper deformations, the work supplies a hybrid vision-plus-physics pipeline that can serve as a more interpretable and generalizable alternative to end-to-end learning for indirect force sensing in soft grippers. Such a capability would directly support safer manipulation of delicate objects and could be integrated into learning-based controllers without additional hardware.
major comments (2)
- The reported force RMSE values are obtained exclusively from inverse simulation; however, the manuscript contains no forward-validation experiment that compares simulated keypoint trajectories or surface deformations against physical measurements collected under known applied loads. Because the SOFA fin-ray model, hyperelastic constitutive law, mesh resolution, friction, and boundary conditions are never shown to match the real silicone gripper, any systematic mismatch will produce biased force estimates even when keypoint localization is perfect.
- Methods section on FEA setup: the gripper material stiffness, geometry parameters, and constitutive-law coefficients are treated as free parameters whose values are required for the inverse solve, yet the text supplies no calibration procedure, ground-truth sensor data, or sensitivity analysis that would confirm these parameters were obtained independently of the force-estimation trials themselves.
minor comments (1)
- Abstract: quantitative performance numbers are given without any mention of the number of trials, object-selection criteria, or how ground-truth forces were measured; adding these details would strengthen the abstract even if they appear later in the experimental section.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive review. The comments highlight important aspects of model validation and parameter transparency that we address below. We have revised the manuscript to incorporate additional validation experiments and expanded methodological descriptions.
read point-by-point responses
-
Referee: The reported force RMSE values are obtained exclusively from inverse simulation; however, the manuscript contains no forward-validation experiment that compares simulated keypoint trajectories or surface deformations against physical measurements collected under known applied loads. Because the SOFA fin-ray model, hyperelastic constitutive law, mesh resolution, friction, and boundary conditions are never shown to match the real silicone gripper, any systematic mismatch will produce biased force estimates even when keypoint localization is perfect.
Authors: We agree that explicit forward validation of the FEA model against physical measurements strengthens confidence in the inverse results. The original manuscript validated the end-to-end system by comparing estimated forces to independent sensor ground truth during grasping, but did not include a dedicated forward check of simulated vs. observed deformations. In the revised version we have added a new subsection 'Forward Finite-Element Model Validation' that reports additional experiments: known loads were applied via a calibrated load cell while RGB-D images recorded the resulting gripper deformation; the same loads were then applied in SOFA and keypoint/surface errors were quantified. Average keypoint position error is 1.8 mm and surface deviation is 2.3 mm, with a sensitivity study on mesh density and friction confirming robustness. These results are now reported alongside the original force RMSE figures. revision: yes
-
Referee: Methods section on FEA setup: the gripper material stiffness, geometry parameters, and constitutive-law coefficients are treated as free parameters whose values are required for the inverse solve, yet the text supplies no calibration procedure, ground-truth sensor data, or sensitivity analysis that would confirm these parameters were obtained independently of the force-estimation trials themselves.
Authors: We appreciate the request for greater transparency. The parameters were obtained from manufacturer material data sheets combined with separate preliminary calibration trials (distinct from the main grasping dataset) that used a force-torque sensor and optical tracking to match simulated and measured deformations. To make this explicit, the revised Methods section now contains a dedicated 'Model Parameter Calibration' subsection that details the independent calibration protocol, lists the final parameter values, and includes a sensitivity analysis showing that force estimates change by less than 8 % for parameter variations within the range of experimental uncertainty. The parameters remain fixed across all reported trials. revision: yes
Circularity Check
Inverse FEA force recovery relies on external model assumptions without self-referential reduction to fitted inputs.
full rationale
The paper extracts keypoints from RGB-D images via a DL pipeline, then parameterizes an inverse SOFA FEA simulation to recover contact forces. Reported RMSE/NRMSD figures are computed against external ground-truth force measurements during grasping trials, not against quantities defined or fitted inside the same loop. No equations, self-citations, or ansatzes are shown to make the force estimate tautological with the input observations or model parameters; the forward simulation fidelity is an external assumption rather than a definitional closure. This yields only minor circularity risk from unvalidated model parameters, consistent with a score of 2.
Axiom & Free-Parameter Ledger
free parameters (1)
- Gripper material stiffness and geometry parameters
axioms (1)
- domain assumption The SOFA finite-element model of the fin-ray gripper accurately reproduces real deformation under contact loads
Reference graph
Works this paper leans on
-
[1]
Leveraging haptic feedback to improve data quality and quantity for deep imitation learning models,
C. Cuan, A. Okamura, and M. Khansari, “Leveraging haptic feedback to improve data quality and quantity for deep imitation learning models,” IEEE Transactions on Haptics, vol. 17, no. 4, pp. 984–991, 2024
2024
-
[2]
Force-aware autonomous robotic surgery,
A. E. Abdelaalet al., “Force-aware autonomous robotic surgery,”arXiv preprint arXiv:2501.11742, 2025
-
[3]
Multi- degree-of-freedom force sensor incorporated into soft robotic gripper for improved grasping stability,
H. Mun, D. S. Diaz Cortes, J.-H. Youn, and K.-U. Kyung, “Multi- degree-of-freedom force sensor incorporated into soft robotic gripper for improved grasping stability,”Soft Robotics, vol. 11, no. 4, pp. 628– 638, 2024
2024
-
[4]
Recent progress in advanced tactile sensing technologies for soft grippers,
J. Quet al., “Recent progress in advanced tactile sensing technologies for soft grippers,”Advanced Functional Materials, vol. 33, no. 41, 2023
2023
-
[5]
Classification of vision-based tactile sensors: A review,
H. Li, Y . Lin, C. Lu, M. Yang, E. Psomopoulou, and N. F. Lepora, “Classification of vision-based tactile sensors: A review,”IEEE Sensors Journal, vol. 25, no. 19, p. 35672–35686, 2025
2025
-
[6]
Intrinsic contact sensing and object perception of an adaptive fin-ray gripper integrating compact deflection sensors,
G. Chenet al., “Intrinsic contact sensing and object perception of an adaptive fin-ray gripper integrating compact deflection sensors,”IEEE Transactions on Robotics, vol. 39, no. 6, 2023
2023
-
[7]
Visual contact pressure estimation for grippers in the wild,
J. A. Collins, C. Houff, P. Grady, and C. C. Kemp, “Visual contact pressure estimation for grippers in the wild,” inIEEE/RSJ International Conference on Intelligent Robots and Systems, 2023, pp. 10 947–10 954
2023
-
[8]
A compliant adaptive gripper and its intrinsic force sensing method,
W. Xu, H. Zhang, H. Yuan, and B. Liang, “A compliant adaptive gripper and its intrinsic force sensing method,”IEEE Transactions on Robotics, vol. 37, no. 5, 2021
2021
-
[9]
A deep learning method for vision based force prediction of a soft fin ray gripper using simulation data,
D. De Barrie, M. Pandya, H. Pandya, M. Hanheide, and K. Elgeneidy, “A deep learning method for vision based force prediction of a soft fin ray gripper using simulation data,”Frontiers in Robotics and AI, vol. 8, 2021
2021
-
[10]
Forces for free: Vision-based contact force estimation with a compliant hand,
Y . Zhu, M. Hao, X. Zhu, Q. Bateux, A. Wong, and A. M. Dollar, “Forces for free: Vision-based contact force estimation with a compliant hand,” Science Robotics, vol. 10, no. 103, p. eadq5046, 2025
2025
-
[11]
Miniature compliant grippers with vision-based force sensing,
A. N. Reddy, N. Maheshwari, D. K. Sahu, and G. K. Ananthasuresh, “Miniature compliant grippers with vision-based force sensing,”IEEE Transactions on Robotics, vol. 26, no. 5, pp. 867–877, 2010
2010
-
[12]
Vision-based interac- tion force estimation for robot grip motion without tactile/force sensor,
D.-K. Ko, K.-W. Lee, D. H. Lee, and S.-C. Lim, “Vision-based interac- tion force estimation for robot grip motion without tactile/force sensor,” Expert Systems with Applications, vol. 211, p. 118441, 2023
2023
-
[13]
$\pi_{0.5}$: a Vision-Language-Action Model with Open-World Generalization
P. Intelligenceet al., “π 0.5: A Vision-Language-Action Model with Open-World Generalization,”arXiv preprint arXiv:2504.16054, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[14]
Calibration and external force sensing for soft robots using an rgb-d camera,
Z. Zhang, A. Petit, J. Dequidt, and C. Duriez, “Calibration and external force sensing for soft robots using an rgb-d camera,”IEEE Robotics and Automation Letters, vol. 4, no. 3, pp. 2356–2363, 2019
2019
-
[15]
Software toolkit for modeling, simulation, and control of soft robots,
E. Coevoetet al., “Software toolkit for modeling, simulation, and control of soft robots,”Advanced Robotics, vol. 31, no. 22, pp. 1208–1224, 2017
2017
-
[16]
Using deeplabcut for 3d markerless pose estimation across species and behaviors,
T. Nath, A. Mathis, A. C. Chen, A. Patel, M. Bethge, and M. W. Mathis, “Using deeplabcut for 3d markerless pose estimation across species and behaviors,”Nature Protocols, vol. 14, no. 7, pp. 2152–2176, 2019
2019
-
[17]
Foundationpose: Uni- fied 6d pose estimation and tracking of novel objects,
B. Wen, W. Yang, J. Kautz, and S. Birchfield, “Foundationpose: Uni- fied 6d pose estimation and tracking of novel objects,” inIEEE/CVF Conference on Computer Vision and Pattern Recognition, vol. 35, 2024, Conference Proceedings, pp. 17 868–17 879
2024
-
[18]
Research on mechanical properties and model parameters of 3d printed tpu material,
B. Xie, M. Jin, Z. Yang, J. Duan, M. Qu, and J. Li, “Research on mechanical properties and model parameters of 3d printed tpu material,” Journal of Engineering Design, vol. 30, no. 4, pp. 419–428, 2023
2023
-
[19]
SAM 3D: 3Dfy Anything in Images
S. D. Teamet al., “Sam 3d: 3dfy anything in images,”arXiv preprint arXiv:2511.16624, 2025
work page internal anchor Pith review arXiv 2025
-
[20]
Manipulating and grasping forces in manipulation by multifingered robot hands,
T. Yoshikawa and K. Nagai, “Manipulating and grasping forces in manipulation by multifingered robot hands,”IEEE Transactions on Robotics and Automation, vol. 7, no. 1, pp. 67–77, 1991
1991
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.