Learning Generalizable Action Representations via Pre-training AEMG
Pith reviewed 2026-05-07 17:08 UTC · model grok-4.3
The pith
AEMG pre-trains EMG signals as a cross-device physiological language using a contraction tokenizer to improve generalization in motor intent decoding.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By treating EMG signals linguistically and pre-training on a massive cross-device dataset, AEMG learns representations that generalize across subjects, devices, and tasks, achieving 5.79-9.25% higher zero-shot leave-one-subject-out accuracy than existing methods and over 90% performance in few-shot settings using only 5% of target data.
What carries the argument
The Neuromuscular Contraction Tokenizer (NCT), which converts discrete muscle contractions into structural words and temporal activation patterns into coherent sentences to support linguistic-style pre-training on EMG data.
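To make the "contractions as words" idea concrete, here is a minimal sketch. It is not the paper's NCT, whose internals this summary does not describe. It assumes a single-channel rectified envelope, a hypothetical activation threshold, and a hypothetical rule that bins each contraction's peak amplitude into a word ID; every function name and parameter below is illustrative.

```python
from typing import List, Sequence, Tuple


def segment_contractions(envelope: Sequence[float], threshold: float) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index pairs where the envelope exceeds threshold."""
    segments: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(envelope):
        if value > threshold and start is None:
            start = i  # contraction onset
        elif value <= threshold and start is not None:
            segments.append((start, i))  # contraction offset
            start = None
    if start is not None:  # signal ended while still active
        segments.append((start, len(envelope)))
    return segments


def contraction_to_word(envelope: Sequence[float], segment: Tuple[int, int],
                        n_bins: int = 4, max_amp: float = 1.0) -> int:
    """Assumed rule: the 'word' is the amplitude bin of the contraction's peak."""
    start, end = segment
    peak = max(envelope[start:end])
    return min(int(peak / max_amp * n_bins), n_bins - 1)


def tokenize(envelope: Sequence[float], threshold: float = 0.2,
             n_bins: int = 4, max_amp: float = 1.0) -> List[int]:
    """Turn an envelope into a 'sentence': one word per detected contraction."""
    return [contraction_to_word(envelope, seg, n_bins, max_amp)
            for seg in segment_contractions(envelope, threshold)]


# Toy envelope with two contractions of different strength (made-up values).
envelope = [0.0, 0.1, 0.5, 0.6, 0.1, 0.0, 0.9, 0.95, 0.3, 0.1]
sentence = tokenize(envelope)
print(sentence)
```

Even this toy version shows where the load-bearing premise bites: the threshold and bin edges are fixed, so anything that shifts amplitude between subjects or devices changes which words are emitted.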
If this is right
- Zero-shot leave-one-subject-out accuracy improves by 5.79-9.25% over six state-of-the-art baselines.
- Few-shot adaptation reaches more than 90% performance using only 5% of target user data.
- Seamless transfer occurs across arbitrary channel topologies and sampling rates.
- A single pre-trained model can serve as a foundation for multiple EMG applications without repeated per-user training.
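The zero-shot leave-one-subject-out (LOSO) figure depends on the held-out subject contributing nothing to training. A minimal sketch of that split follows. The sample layout `(subject_id, features, label)`, the subject names, and the `majority_baseline` scorer are assumptions for illustration and do not come from the paper.

```python
from typing import Callable, Dict, Hashable, Iterator, List, Sequence, Tuple

Sample = Tuple[Hashable, object, int]  # (subject_id, features, label), assumed layout


def loso_folds(samples: Sequence[Sample]) -> Iterator[Tuple[Hashable, List[Sample], List[Sample]]]:
    """Yield (held_out_subject, train, test) so that no subject appears in both sets."""
    subjects = sorted({s[0] for s in samples})
    for held_out in subjects:
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test


def loso_accuracy(samples: Sequence[Sample],
                  train_and_score: Callable[[List[Sample], List[Sample]], float]) -> Dict[Hashable, float]:
    """Run one fold per subject and collect the per-subject score."""
    return {subj: train_and_score(train, test) for subj, train, test in loso_folds(samples)}


def majority_baseline(train: List[Sample], test: List[Sample]) -> float:
    """Placeholder model: always predict the most common training label."""
    labels = [label for _, _, label in train]
    majority = max(set(labels), key=labels.count)
    return sum(1 for _, _, label in test if label == majority) / len(test)


# Made-up data: three subjects, features omitted.
samples = [("s1", None, 0), ("s1", None, 0), ("s1", None, 0),
           ("s2", None, 0), ("s2", None, 1),
           ("s3", None, 1), ("s3", None, 1)]
per_subject = loso_accuracy(samples, majority_baseline)
print(per_subject)
```

In practice a library splitter such as scikit-learn's `LeaveOneGroupOut` produces the same partition; the hand-written version is shown only to keep the example dependency-free.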
Where Pith is reading between the lines
- The linguistic treatment of EMG may extend to other time-series biosignals to create unified foundation models.
- Prosthetic and human-computer interface systems could reduce per-user calibration time substantially.
- Scaling the cross-device vocabulary further might yield additional gains in rare or complex action classes.
Load-bearing premise
That EMG signals contain consistent linguistic structures across subjects and devices that can be tokenized without losing information needed to distinguish different actions.
What would settle it
If a pre-trained AEMG model shows no accuracy gain or a loss relative to non-pretrained baselines when tested on a completely new device or subject cohort, the claimed generalization benefit would not hold.
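One way to run that check is a paired comparison of per-subject accuracies for the pre-trained and non-pretrained models on the same held-out subjects. The sketch below computes only the paired t-statistic; the accuracy values are invented for illustration and are not results from the paper.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def paired_t_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Return t = mean(d) / (sd(d) / sqrt(n)) for paired differences d = a - b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(len(diffs)))


# Hypothetical per-subject accuracies on a new cohort, NOT taken from the paper.
pretrained = [0.82, 0.79, 0.88, 0.75, 0.81]
baseline = [0.78, 0.77, 0.80, 0.74, 0.76]
t = paired_t_statistic(pretrained, baseline)
print(f"paired t = {t:.3f} with {len(pretrained) - 1} degrees of freedom")
```

A statistic near zero or below it on a genuinely new device or cohort is the outcome that would undermine the generalization claim. Turning t into a p-value needs a t-distribution, for example via `scipy.stats.ttest_rel`.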
Original abstract
A fundamental role in decoding human motor intent and enabling intuitive human-computer interaction is played by electromyography (EMG). However, its generalization capability across subjects, devices, and tasks remains substantially limited by data heterogeneity, label scarcity, and the lack of a unified representational framework. To bridge this gap, we propose Any Electromyography (AEMG), the first large-scale, self-supervised representation learning framework for EMG. AEMG reconceptualizes neuromuscular dynamics linguistically, utilizing a novel Neuromuscular Contraction Tokenizer (NCT) to translate discrete muscle contractions into structural words and temporal activation patterns into coherent sentences. Furthermore, we compile the largest cross-device EMG signal vocabulary to date, enabling seamless transfer across arbitrary channel topologies and sampling rates. Experiments demonstrate that AEMG improves the zero-shot leave-one-subject-out (LOSO) accuracy by 5.79-9.25% compared to six state-of-the-art baselines, and achieves more than 90% few-shot adaptation performance with only 5% of target user data. Our work has proposed the concept of EMG signals as a cross-device physiological language, learned their grammar from massive amounts of data, and laid the groundwork for a single-training, universally applicable EMG foundation model.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes AEMG, the first large-scale self-supervised representation learning framework for EMG signals. It reconceptualizes neuromuscular dynamics linguistically via a novel Neuromuscular Contraction Tokenizer (NCT) that discretizes muscle contractions into structural words and temporal activation patterns into sentences. A large cross-device EMG signal vocabulary is compiled to support transfer across arbitrary channel topologies and sampling rates. Experiments are reported to show 5.79-9.25% gains in zero-shot leave-one-subject-out (LOSO) accuracy over six baselines and >90% few-shot adaptation performance using only 5% of target user data.
Significance. If the reported gains hold and are attributable to the linguistic modeling rather than dataset scale alone, the work has high significance for EMG-based motor intent decoding and human-computer interaction. The compilation of the largest cross-device EMG vocabulary to date and the self-supervised pre-training approach directly address label scarcity and heterogeneity; these are concrete strengths that could support future foundation models. The linguistic analogy provides a fresh conceptual lens even if the empirical validation requires strengthening.
major comments (2)
- [Abstract] Abstract: The headline claims of 5.79-9.25% zero-shot LOSO accuracy improvement and >90% few-shot performance with 5% data are stated without any reference to experimental protocol, dataset details (subjects, devices, tasks), statistical tests, or ablation results. This absence is load-bearing for the central generalization claim.
- [NCT description] Section describing the Neuromuscular Contraction Tokenizer (NCT): The premise that NCT produces a lossless, subject- and device-invariant linguistic representation (words from contractions, sentences from patterns) is central to attributing gains to the proposed grammar rather than other pre-training choices, yet no analysis of information loss from discretization, fixed thresholds, or quantization, nor ablations against non-linguistic baselines, is supplied.
minor comments (1)
- [Abstract] The abstract uses 'AEMG' both for the framework and implicitly for the signals; a brief clarification of acronym scope would improve readability.
Simulated Author's Rebuttal
We thank the referee for their insightful comments, which have helped us improve the clarity and rigor of our manuscript. We address each major comment in detail below, providing clarifications and indicating the revisions made to the paper.
- Referee: [Abstract] Abstract: The headline claims of 5.79-9.25% zero-shot LOSO accuracy improvement and >90% few-shot performance with 5% data are stated without any reference to experimental protocol, dataset details (subjects, devices, tasks), statistical tests, or ablation results. This absence is load-bearing for the central generalization claim.
  Authors: We acknowledge the referee's point that the abstract lacks specific references to the experimental details. The manuscript body provides comprehensive descriptions of the datasets (including subject numbers, device types, and task specifications), the leave-one-subject-out protocol, and comparisons with baselines. Statistical tests (paired t-tests) were used to validate the improvements. To address this, we have revised the abstract to briefly mention the key aspects of the evaluation protocol and datasets, ensuring the claims are better contextualized without exceeding length limits.
  Revision: yes
- Referee: [NCT description] Section describing the Neuromuscular Contraction Tokenizer (NCT): The premise that NCT produces a lossless, subject- and device-invariant linguistic representation (words from contractions, sentences from patterns) is central to attributing gains to the proposed grammar rather than other pre-training choices, yet no analysis of information loss from discretization, fixed thresholds, or quantization, nor ablations against non-linguistic baselines, is supplied.
  Authors: We agree that additional analysis would strengthen the attribution of gains to the linguistic modeling. The NCT uses fixed thresholds derived from neuromuscular physiology to ensure invariance, and the cross-device vocabulary addresses heterogeneity in channel topologies and sampling rates. However, explicit quantification of information loss due to discretization and ablations against non-linguistic baselines were not included. We will incorporate a new analysis section quantifying reconstruction error from the tokenizer and an ablation comparing NCT to a non-linguistic baseline (e.g., direct feature extraction without tokenization) in the revised manuscript.
  Revision: yes
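In its simplest form, a reconstruction-error analysis of a discretizing tokenizer could look like the sketch below. It uses a plain uniform quantizer as a stand-in, because neither the report nor the rebuttal specifies how NCT discretizes; the level counts, signal range, and test signal are all assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def quantize(signal: Sequence[float], n_levels: int, lo: float, hi: float) -> List[int]:
    """Map each sample to one of n_levels uniform bins over [lo, hi], clipping outliers."""
    width = (hi - lo) / n_levels
    tokens = []
    for x in signal:
        clipped = min(max(x, lo), hi)
        tokens.append(min(int((clipped - lo) / width), n_levels - 1))
    return tokens


def dequantize(tokens: Sequence[int], n_levels: int, lo: float, hi: float) -> List[float]:
    """Reconstruct each sample as the centre of its bin."""
    width = (hi - lo) / n_levels
    return [lo + (t + 0.5) * width for t in tokens]


def rmse(a: Sequence[float], b: Sequence[float]) -> float:
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Synthetic sine wave standing in for a normalized EMG envelope (not real data).
signal = [math.sin(2 * math.pi * i / 50) for i in range(200)]
for n_levels in (4, 16, 64):
    recon = dequantize(quantize(signal, n_levels, -1.0, 1.0), n_levels, -1.0, 1.0)
    print(n_levels, round(rmse(signal, recon), 4))
```

Reconstruction error only bounds how much waveform detail survives discretization. Whether the lost detail matters for telling actions apart would still need the ablation against a non-tokenized baseline that the authors promise.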
Circularity Check
No circularity: empirical gains rest on external baselines, not self-referential definitions or fitted inputs.
full rationale
The paper presents AEMG as a self-supervised framework that tokenizes EMG via NCT into words/sentences and pretrains on a compiled cross-device vocabulary. Its strongest claims are zero-shot LOSO accuracy improvements (5.79-9.25%) and few-shot results (>90% with 5% data) measured against six independent state-of-the-art baselines. No equations, parameter-fitting steps, or self-citations are shown that reduce any reported prediction or generalization result to a quantity defined in terms of itself. The NCT discretization and vocabulary construction are introduced as novel design choices whose validity is tested by downstream performance rather than assumed by construction. This is the common honest case of a self-contained empirical paper whose central results do not collapse to tautology.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: EMG neuromuscular dynamics can be tokenized into structural words (discrete contractions) and coherent sentences (temporal activation patterns) without critical information loss.
invented entities (2)
- Neuromuscular Contraction Tokenizer (NCT): no independent evidence
- Cross-device EMG signal vocabulary: no independent evidence