
arxiv: 2606.31236 · v1 · pith:24GBCEULnew · submitted 2026-06-30 · 💻 cs.RO

TactX: Learning Shared Tactile Representations Across Diverse Sensors

Pith reviewed 2026-07-01 05:21 UTC · model grok-4.3

classification 💻 cs.RO
keywords tactile representations · shared latent space · sensor transfer · contact-rich manipulation · zero-shot transfer · multimodal tactile · robot manipulation · policy transfer

The pith

TactX learns a shared latent space across tactile sensors of different types, enabling zero-shot policy transfer between them.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes TactX to learn transferable tactile representations across resistive, magnetic, and vision-based sensors. It uses paired contact data to train modality-specific encoders that map inputs to a common latent space. This allows policies trained on one sensor to be applied directly to another. Experiments on four manipulation tasks show improved success rates over vision-only approaches. This matters for making tactile sensing more practical across varied robot hardware.
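The deployment pattern described above can be sketched as a policy that only ever consumes the shared latent, so that changing sensors means changing the encoder and nothing else. This is a minimal illustration, not the paper's architecture: the linear encoders, the latent size, the input dimensions, and the class names `LinearEncoder` and `LatentPolicy` are all hypothetical.

```python
import numpy as np

LATENT_DIM = 8  # hypothetical latent size, not taken from the paper


class LinearEncoder:
    """Stand-in for a modality-specific encoder (linear only for brevity)."""

    def __init__(self, input_dim, seed):
        rng = np.random.default_rng(seed)
        self.weight = rng.normal(scale=0.1, size=(input_dim, LATENT_DIM))

    def __call__(self, observation):
        z = observation @ self.weight
        return z / (np.linalg.norm(z) + 1e-8)  # unit-norm latent


class LatentPolicy:
    """Policy that sees only the shared latent, never raw sensor data."""

    def __init__(self, action_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.weight = rng.normal(scale=0.1, size=(LATENT_DIM, action_dim))

    def act(self, latent):
        return np.tanh(latent @ self.weight)


# Input sizes below are placeholders for three differently shaped sensors.
encoders = {
    "resistive": LinearEncoder(input_dim=192, seed=1),
    "magnetic": LinearEncoder(input_dim=15, seed=2),
    "vision": LinearEncoder(input_dim=1024, seed=3),
}
policy = LatentPolicy(action_dim=7)


def step(sensor_name, observation):
    """Swap the encoder by name; the policy itself is untouched."""
    return policy.act(encoders[sensor_name](observation))


action_a = step("resistive", np.ones(192))
action_b = step("magnetic", np.ones(15))
```

Whether the two actions agree for the same physical contact depends entirely on how well the encoders are aligned, which is the question the rest of this page examines.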

Core claim

TactX maps heterogeneous tactile observations into a shared latent space through modality-specific encoders trained on paired contact data. Such paired interactions provide a natural alignment signal across modalities, and the encoders are jointly trained across all sensor pairs, inducing a consistent latent space for all sensor types. Our experiments show that TactX aligns tactile representations across sensors while preserving object-level contact information. Policies trained with one sensor transfer zero-shot to physically distinct sensors through the shared latent, improving the average success rate on four contact-rich manipulation tasks from 27.5% for a vision-only policy to 45.9%.

What carries the argument

modality-specific encoders jointly trained on paired contact data to induce a consistent latent space across resistive, magnetic, and vision-based sensors
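The Figure 2 caption reproduced further down names the ingredients of this joint training: an InfoNCE alignment term plus self- and cross-reconstruction. The sketch below shows one plausible form of those two terms for a single sensor pair. The temperature, the unweighted sum, the mean-squared-error choice, and the function names are assumptions; the paper's exact losses are not given on this page.

```python
import numpy as np


def l2_normalize(x):
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)


def info_nce(z_i, z_j, temperature=0.1):
    """Symmetric InfoNCE over paired latents of shape (batch, dim).

    Row k of z_i and row k of z_j are assumed to come from the same contact.
    """
    z_i, z_j = l2_normalize(z_i), l2_normalize(z_j)
    logits = z_i @ z_j.T / temperature  # (batch, batch) similarities

    def cross_entropy_diag(lg):
        lg = lg - lg.max(axis=1, keepdims=True)  # numerical stability
        log_prob = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))  # positives sit on the diagonal

    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))


def reconstruction_loss(decode_i, decode_j, z_i, z_j, x_i, x_j):
    """Self terms decode a latent back to its own sensor; cross terms
    decode it to the paired sensor's ground-truth observation."""
    self_term = np.mean((decode_i(z_i) - x_i) ** 2) + np.mean((decode_j(z_j) - x_j) ** 2)
    cross_term = np.mean((decode_j(z_i) - x_j) ** 2) + np.mean((decode_i(z_j) - x_i) ** 2)
    return self_term + cross_term


rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
aligned_loss = info_nce(z, z)                        # correct pairing
shuffled_loss = info_nce(z, np.roll(z, 1, axis=0))   # deliberately wrong pairing
identity = lambda latent: latent
zero_recon = reconstruction_loss(identity, identity, z, z, z, z)
```

The contrastive term is lower when rows are correctly paired than when the pairing is shuffled, which is the sense in which paired contacts act as an alignment signal.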

If this is right

  • Policies trained on data from one tactile sensor can be deployed on different sensors without retraining.
  • Average success across pick-and-place, plug insertion, board wiping, and object reorientation rises from 27.5% with a vision-only policy to 45.9%.
  • The latent space supports both alignment across sensors and preservation of object contact details.
  • Manipulation policies become less dependent on specific tactile hardware.
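One way to check the alignment half of these implications is the cosine-similarity measure that the Figure 3 caption mentions. The sketch below compares paired latents against deliberately mismatched ones on synthetic data; the latent dimension, noise level, and the helper name `mean_paired_cosine` are illustrative assumptions.

```python
import numpy as np


def mean_paired_cosine(z_a, z_b):
    """Mean cosine similarity between row-aligned latents from two sensors."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    return float(np.mean(np.sum(z_a * z_b, axis=1)))


rng = np.random.default_rng(0)
contact = rng.normal(size=(256, 16))  # synthetic shared contact content
z_sensor_a = contact + 0.1 * rng.normal(size=contact.shape)
z_sensor_b = contact + 0.1 * rng.normal(size=contact.shape)

aligned = mean_paired_cosine(z_sensor_a, z_sensor_b)
# Rolling the rows breaks the pairing and gives a reference for "no alignment".
mismatched = mean_paired_cosine(z_sensor_a, np.roll(z_sensor_b, 1, axis=0))
```

A well-aligned latent space should score close to 1 on true pairs and near 0 on mismatched ones.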

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Hardware choices for robots could become more flexible if sensors can be swapped without policy changes.
  • Collecting paired contact data might be a scalable way to align new sensor types in the future.
  • The approach could extend to other sensing modalities beyond tactile if similar pairing is possible.

Load-bearing premise

Paired contact interactions supply a sufficient natural alignment signal to induce a consistent latent space across all three transduction modalities while preserving object-level contact information.
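The premise has two halves, sensor invariance and preserved contact content, and the Figure 4 caption describes probing the first by predicting sensor identity from frozen latents, with 33.3% as the three-way chance level. The sketch below uses a nearest-centroid probe on synthetic latents as a stand-in; the probe type, the data, and the function name are assumptions, not the paper's protocol.

```python
import numpy as np


def nearest_centroid_accuracy(latents, labels, rng):
    """Fit class centroids on a random half, report accuracy on the other half."""
    order = rng.permutation(len(latents))
    split = len(order) // 2
    train, test = order[:split], order[split:]
    classes = np.unique(labels)
    centroids = np.stack(
        [latents[train][labels[train] == c].mean(axis=0) for c in classes]
    )
    dists = np.linalg.norm(latents[test][:, None, :] - centroids[None, :, :], axis=2)
    predictions = classes[np.argmin(dists, axis=1)]
    return float(np.mean(predictions == labels[test]))


rng = np.random.default_rng(0)
sensor_ids = np.repeat(np.arange(3), 200)       # three sensors, 200 latents each
invariant = rng.normal(size=(600, 16))          # no sensor-specific signal
leaky = invariant + 5.0 * np.eye(3, 16)[sensor_ids]  # sensor-specific offset

acc_invariant = nearest_centroid_accuracy(invariant, sensor_ids, rng)
acc_leaky = nearest_centroid_accuracy(leaky, sensor_ids, rng)
```

Probe accuracy near one third suggests the latent hides sensor identity, while accuracy near 100% indicates a sensor-specific signature that a downstream policy could latch onto.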

What would settle it

Zero-shot transfer experiments yielding success rates no higher than the 27.5% vision-only baseline on the manipulation tasks would indicate the shared latent does not enable effective cross-sensor policy use.
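For scale, the gap between the two reported averages works out as follows. Only the two percentages come from the paper; the variable names are illustrative.

```python
baseline = 27.5  # vision-only average success rate (%), as reported
tactx = 45.9     # TactX average success rate (%), as reported

absolute_gain = tactx - baseline                  # percentage points
relative_gain = 100.0 * absolute_gain / baseline  # percent over the baseline

print(f"{absolute_gain:.1f} points, {relative_gain:.1f}% relative")
```

This is an 18.4-point absolute gain, or roughly 67% relative to the baseline. The criterion above would be met if that margin shrank to zero or below.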

Figures

Figures reproduced from arXiv: 2606.31236 by Carmelo Sferrazza, Junsung Park, Sachin Bhadang, Sha Yi, Xiaolong Wang.

Figure 1
Figure 1: TactX learns a shared latent representation that aligns heterogeneous tactile sensors and enables zero-shot transfer of tactile-conditioned policies. Abstract: Tactile sensors provide critical information for contact-rich manipulation, yet tactile representations and policies remain tightly coupled to each specific sensor, limiting transferability across robots and hardware platforms. We propose TactX, a…
Figure 2
Figure 2: TactX trains on paired contacts from two sensors at a time. Paired observations are encoded into a shared latent space, aligned with InfoNCE, and decoded through self- and cross-reconstruction. Other pairs are trained analogously, yielding a single latent space shared by all three sensors. (cross-reconstruction, e.g. $g_j(z_i) \to x_j$): the latent from one finger must reconstruct the other finger’s ground-trut…
Figure 3
Figure 3: Transitive cross-sensor alignment. Cosine similarity along the Daimon→eFlesh→FlexiTac path measures global latent alignment, with dashed lines indicating the mean for each method. We first evaluate whether TactX aligns tactile observations from different sensing modalities into a shared latent space. We compare TactX with three objective variants: a reconstruction-only model (using Eq. (2)), a contrastive…
Figure 4
Figure 4: Sensor invariance and semantic preservation in the shared latent space. Sensor-prediction accuracy measures whether sensor identity remains recoverable from frozen latents, where lower values closer to the 33.3% chance level indicate stronger sensor invariance. Object-classification accuracy evaluates whether object-level information is preserved, where “Self” denotes training and testing on the same sens…
Figure 5
Figure 5: Self- and cross-reconstruction from the shared latent. We visualize representative validation contacts from sphere, plane, and circle indentors. For each sensor, the first column is the ground-truth observation, the diagonal entries are self-reconstructions, and the off-diagonal entries are cross-reconstructions decoded from the nearest latent representations of the other sensors in the validation set. ti…
Figure 6
Figure 6: Downstream manipulation tasks. We evaluate zero-shot tactile policy transfer on four contact-rich tasks: plug insertion, board wiping, pick-and-place, and object reorientation. … transfer is its sensitivity to the contact threshold: we use three separate sensor-specific thresholds that are held fixed across all tasks (Appendix D.3), and this threshold mismatch between tasks leads to higher variance and incon…
Figure 7
Figure 7: The three tactile sensors used in TactX, each spanning a different transduction modality. All three are visually matched (black TPU/tape/elastomer) to remove cosmetic shortcuts and have roughly commensurate active sensing areas. Mounting and pairing. Two sensors are mounted on opposing fingers of a Franka parallel-jaw gripper; the third is swapped in for separate runs. We cover all $\binom{3}{2} \times 2 = 6$ configurat…
Figure 8
Figure 8: The 10 3D-printed pretraining objects used for paired data collection, spanning point, edge, and area contact geometries. Objects and protocol. Pretraining data uses 10 3D-printed objects (…
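The Figure 7 caption counts six mounting configurations for three sensors on a two-finger gripper. One reading of that count, assumed here, is every unordered sensor pair in both finger assignments. The sensor names are taken from the Figure 3 caption; the left/right finger labels are hypothetical.

```python
from itertools import combinations, permutations

sensors = ["Daimon", "eFlesh", "FlexiTac"]

# Each unordered pair of sensors, mounted in both (left, right) orders.
configs = [
    (left, right)
    for pair in combinations(sensors, 2)
    for left, right in permutations(pair)
]

for left, right in configs:
    print(f"left finger: {left:8s}  right finger: {right}")
```

Under this reading, each sensor appears on each finger alongside each of the other two sensors, giving 3 pairs times 2 orders.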
Original abstract

Tactile sensors provide critical information for contact-rich manipulation, yet tactile representations and policies remain tightly coupled to each specific sensor, limiting transferability across robots and hardware platforms. We propose TactX, a framework for learning a transferable tactile representation across sensors spanning three fundamentally different transduction modalities: resistive, magnetic, and vision-based. TactX maps heterogeneous tactile observations into a shared latent space through modality-specific encoders trained on paired contact data. Such paired interactions provide a natural alignment signal across modalities, and the encoders are jointly trained across all sensor pairs, inducing a consistent latent space for all sensor types. Our experiments show that TactX aligns tactile representations across sensors while preserving object-level contact information, as evidenced by sensor-identity prediction and object classification in the learned latent space. We evaluate TactX on four contact-rich manipulation tasks: pick-and-place, plug insertion, board wiping, and object reorientation, and show that policies trained with one sensor transfer zero-shot to physically distinct sensors through the shared latent. This improves the average success rate from 27.5% for vision-only policy to 45.9%, providing a step toward sensor-agnostic tactile manipulation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes TactX, a framework that learns a shared latent space for tactile representations across three transduction modalities (resistive, magnetic, vision-based) by training modality-specific encoders jointly on paired contact interactions. It claims this alignment preserves object-level contact information (verified via sensor-identity and object classification) and enables zero-shot policy transfer across sensors on four contact-rich tasks, raising average success from 27.5% (vision-only baseline) to 45.9%.

Significance. If the zero-shot transfer result holds under rigorous controls, the work would address a practical barrier in contact-rich manipulation by decoupling policies from specific sensor hardware, potentially enabling more reusable tactile skills across robot platforms.

major comments (2)
  1. [Method description (alignment signal)] The central zero-shot transfer claim rests on the assumption that joint training on paired contact tuples alone induces functionally interchangeable latents across modalities with non-overlapping spatial support, noise spectra, and dynamic range. No cycle-consistency, invariance, or distribution-matching term is described that would enforce this equivalence outside the paired set; without such a mechanism the policy (trained only on one sensor's latents) can encounter out-of-distribution inputs from a new sensor.
  2. [Experiments (success-rate results)] The reported improvement from 27.5% to 45.9% is presented without accompanying details on trial counts, statistical tests, variance across seeds, or ablations that isolate the contribution of the shared latent versus other factors (e.g., sensor-specific fine-tuning or task-specific data). These omissions are load-bearing for the transfer claim.
minor comments (2)
  1. [Abstract / Method] The abstract and method sections would benefit from an explicit statement of the loss function(s) used for joint encoder training and the precise definition of 'paired contact tuples' (e.g., how temporal and spatial alignment is performed across modalities).
  2. [Experiments] Figure captions and table headers should clarify whether the reported success rates are means over multiple runs and whether the vision-only baseline uses the same policy architecture as the TactX variants.
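To make the second major comment concrete, the sketch below shows how a trial count turns a success rate into an interval, using the standard Wilson score interval for a binomial proportion. The counts (18 successes in 40 trials) are hypothetical, since the abstract reports neither trial counts nor variance.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    p = successes / trials
    denom = 1 + z ** 2 / trials
    center = (p + z ** 2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z ** 2 / (4 * trials ** 2)) / denom
    return center - half, center + half


# Hypothetical counts chosen only to illustrate interval width.
low, high = wilson_interval(successes=18, trials=40)
print(f"observed 45.0%, interval {100 * low:.1f}% to {100 * high:.1f}%")
```

With a few dozen trials, the interval spans tens of percentage points, which is why the referee treats the missing counts as load-bearing.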

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We respond point-by-point to the major comments below.

  1. Referee: [Method description (alignment signal)] The central zero-shot transfer claim rests on the assumption that joint training on paired contact tuples alone induces functionally interchangeable latents across modalities with non-overlapping spatial support, noise spectra, and dynamic range. No cycle-consistency, invariance, or distribution-matching term is described that would enforce this equivalence outside the paired set; without such a mechanism the policy (trained only on one sensor's latents) can encounter out-of-distribution inputs from a new sensor.

    Authors: Paired contact tuples correspond to identical physical interactions observed by different sensors, supplying direct supervision that maps each modality's reading of the same event into a common latent. Joint optimization of all modality-specific encoders across every sensor pair further constrains the latent space to be consistent, as any deviation would increase the joint loss. The manuscript already demonstrates that this produces latents preserving object-level contact information via the reported sensor-identity and object-classification probes. The zero-shot policy transfer experiments on four tasks provide empirical evidence that the resulting representations are interchangeable in practice; no explicit cycle-consistency term is required because the multi-pair paired supervision itself enforces the necessary alignment.
    Revision: no

  2. Referee: [Experiments (success-rate results)] The reported improvement from 27.5% to 45.9% is presented without accompanying details on trial counts, statistical tests, variance across seeds, or ablations that isolate the contribution of the shared latent versus other factors (e.g., sensor-specific fine-tuning or task-specific data). These omissions are load-bearing for the transfer claim.

    Authors: We agree that the current manuscript omits these experimental details. In the revision we will report the exact number of trials per task and sensor combination, include standard deviations across random seeds, add appropriate statistical tests, and provide ablations that isolate the contribution of the shared latent from other factors such as task-specific data or sensor-specific fine-tuning.
    Revision: yes

Circularity Check

0 steps flagged

No circularity: alignment induced by external paired data, not internal definitions


The provided abstract and context contain no equations, fitted parameters, or self-citations. The shared latent space is induced by joint training of modality-specific encoders on externally collected paired contact tuples, which constitute an independent alignment signal rather than a quantity defined in terms of the latent itself. No load-bearing step reduces to a self-definition, a renamed prediction, or an imported uniqueness theorem. The zero-shot transfer claim is evaluated against external manipulation benchmarks, rendering the derivation self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no equations, training losses, or architectural details, so free parameters, axioms, and invented entities cannot be audited; the central alignment claim rests on the unstated premise that paired data is available and sufficient.

pith-pipeline@v0.9.1-grok · 5750 in / 1141 out tokens · 29743 ms · 2026-07-01T05:21:18.748415+00:00 · methodology

