
arxiv: 2606.30309 · v1 · pith:GRKIMXUUnew · submitted 2026-06-29 · 💻 cs.CV

A Point Cloud Transformer for Remote Monitoring and Automated Assessment of Physical Rehabilitation Exercises

Pith reviewed 2026-06-30 06:53 UTC · model grok-4.3

classification 💻 cs.CV
keywords point cloud transformer · rehabilitation exercise assessment · RGBD joint positions · axial self-attention · automated quality assessment · Kimore dataset · UI-PRMD dataset · IRDS dataset

The pith

A point cloud transformer assesses rehabilitation exercises from joint positions captured by RGBD sensors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a transformer architecture that takes point clouds of joint positions from RGBD data and scores how correctly patients perform prescribed rehabilitation exercises. It adds curve-based feature aggregation to enrich the input and axial self-attention to highlight which joints matter most. The goal is to replace constant expert supervision with an automated system that works at home. Tests on the Kimore, UI-PRMD, and IRDS datasets show higher accuracy than earlier methods together with small model size and fast inference.

Core claim

A transformer-based framework for point clouds extracts relevant features from joint position data collected through RGBD sensors and assesses the quality of rehabilitation exercises, outperforming existing approaches while remaining practically relevant due to its small size, fast inference, and generalization on specific joints in similar exercises.

What carries the argument

Transformer architecture for point clouds that applies curve-based point-cloud feature aggregation to augment input information and axial self-attention to recognize important joints and their roles.
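
As an illustration of the axial self-attention idea named above, here is a minimal Python sketch. It is not the authors' implementation. It assumes the input is a frames-by-joints grid of feature vectors, uses identity query/key/value projections in place of learned weights, and attends first across joints within each frame and then across frames for each joint. The function names (`self_attention`, `axial_attention`, `joint_importance`) and the toy coordinates are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention over a list of feature vectors.

    Assumption: Q, K and V are the tokens themselves (identity projections);
    a trained model would apply learned linear maps first.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    weights = []
    for query in tokens:
        scores = [scale * sum(a * b for a, b in zip(query, key)) for key in tokens]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * value[c] for w, value in zip(row, tokens)) for c in range(dim)]
        for row in weights
    ]
    return outputs, weights


def axial_attention(x):
    """x[t][j] is the feature vector of joint j at frame t."""
    num_frames, num_joints = len(x), len(x[0])
    # Pass 1: attend across joints, one frame at a time.
    joint_out, joint_weights = [], []
    for t in range(num_frames):
        out, w = self_attention(x[t])
        joint_out.append(out)
        joint_weights.append(w)
    # Pass 2: attend across frames, one joint at a time.
    result = [[None] * num_joints for _ in range(num_frames)]
    for j in range(num_joints):
        column = [joint_out[t][j] for t in range(num_frames)]
        out, _ = self_attention(column)
        for t in range(num_frames):
            result[t][j] = out[t]
    return result, joint_weights


def joint_importance(joint_weights):
    """Mean attention received by each joint, averaged over frames and queries."""
    num_frames = len(joint_weights)
    num_joints = len(joint_weights[0])
    totals = [0.0] * num_joints
    for frame_weights in joint_weights:
        for row in frame_weights:
            for j, value in enumerate(row):
                totals[j] += value
    return [v / (num_frames * num_joints) for v in totals]


# Toy input: 2 frames, 3 joints, 3-D coordinates (made-up values).
frames = [
    [[0.0, 1.0, 0.0], [0.5, 0.5, 0.0], [1.0, 0.0, 0.0]],
    [[0.0, 1.1, 0.0], [0.6, 0.4, 0.1], [1.0, 0.1, 0.0]],
]
features, joint_weights = axial_attention(frames)
importance = joint_importance(joint_weights)
print(importance)
```

Restricting attention to one axis at a time means a sequence of T frames and J joints needs on the order of T·J·(T+J) pairwise scores, rather than (T·J)² for full attention over every frame-joint token. The per-joint averages are one simple way to read off which joints the joint-axis pass weights most heavily.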

If this is right

  • Enables automated quality feedback during home-based rehabilitation without constant expert presence.
  • Highlights specific joints that contribute most to exercise scoring, allowing targeted user guidance.
  • Supports generalization to similar exercises that involve the same joints.
  • Runs with low computational cost, making deployment on modest hardware feasible.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same joint-focused attention mechanism could be tested on other motion-tracking tasks such as sports technique analysis.
  • Adding temporal modeling across exercise repetitions might further improve detection of form errors that appear only over time.
  • Real-world accuracy would still need confirmation against live clinician ratings on previously unseen patients.

Load-bearing premise

The joint position annotations and RGBD data in the three datasets accurately capture exercise quality differences without needing additional context such as patient-specific factors or real-time expert validation.

What would settle it

A new dataset of exercises performed by different patients, scored both by the model and by multiple independent human experts, would show whether model scores align with expert ratings.
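
The comparison described above could be quantified roughly as follows. This is a hedged sketch, assuming each performance is rated by the same number of experts and that the mean expert rating is used as the consensus. Spearman rank correlation and mean absolute error are illustrative choices of agreement measure rather than metrics prescribed by the paper, and all scores below are made up.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


def mean_absolute_error(predicted, target):
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(predicted)


# Hypothetical data: one model score and two expert ratings per performance.
model_scores = [10.0, 20.0, 30.0, 40.0]
expert_ratings = [[12.0, 14.0], [18.0, 22.0], [33.0, 31.0], [35.0, 41.0]]

# Consensus label: mean of the expert ratings for each performance.
consensus = [sum(r) / len(r) for r in expert_ratings]

rho = spearman(model_scores, consensus)
mae = mean_absolute_error(model_scores, consensus)
print(rho, mae)
```

Rank correlation checks whether the model orders performances the way the experts do, while the absolute error checks whether the scores sit on the same scale; a model can do well on one and poorly on the other.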

Original abstract

Rehabilitation exercises are essential in restoring lost physical functions of patients suffering from various diseases (e.g., Parkinson's, back pain). Carrying out these rehabilitation exercises, often prescribed by health experts, is costly, unavailable, and requires expert supervision. The availability of RGBD images and movement/position data of joints along with expert annotation of exercise data has prompted the use of automatic assessment of the quality of rehabilitation exercises, which is cost-effective and can be carried out at home. However, existing approaches do not extract relevant features, lack practical application, require expensive pre-processing, or overlook crucial features. This study proposes a transformer-based framework for point clouds to extract features and assess rehabilitation exercises by analyzing joint positions collected through RGBD data. We adapt and utilize a curve-based point-cloud feature aggregation technique to augment point-cloud information that aids model output. The transformer architecture also uses axial self-attention, recognizing important joints and their roles to assist users in performing the exercise better. The guided system outperforms existing approaches and is also practically relevant due to its small size, fast inference, and generalization on specific joints in similar exercises. We conduct our experiments on three crucial baseline datasets for rehabilitation exercises: Kimore, UI-PRMD, and IRDS.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper proposes a transformer-based point-cloud framework for automatic assessment of rehabilitation exercise quality from RGBD-derived joint positions. It adapts curve-based feature aggregation and axial self-attention to highlight relevant joints, evaluates on the Kimore, UI-PRMD, and IRDS datasets, and claims outperformance over prior methods together with practical advantages of small model size, fast inference, and generalization across similar exercises on specific joints.

Significance. If the performance claims are substantiated by properly validated labels and ablations, the approach could support scalable home-based monitoring systems that reduce reliance on in-person expert supervision for conditions such as Parkinson's or back pain.

major comments (2)
  1. [Datasets and Evaluation] § on Datasets (Kimore, UI-PRMD, IRDS): the central outperformance claim rests on expert-annotated quality labels as ground truth, yet the manuscript supplies no inter-rater reliability statistics, no controls for patient-specific confounders (age, pathology, prior injury), and no discussion of how these labels isolate exercise quality independent of such factors. If label noise or bias is present, the reported superiority and joint-generalization results may simply reflect dataset artifacts rather than learned features.
  2. [Results] Results section / tables: the abstract asserts quantitative outperformance, yet the provided manuscript excerpt contains no numerical metrics, baseline comparisons, ablation studies, or error analysis. Without these load-bearing elements, the superiority and practical-relevance claims cannot be evaluated.
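
One form the inter-rater reliability statistic requested in comment 1 could take is an intraclass correlation. The sketch below computes the one-way random-effects, single-rater ICC, often written ICC(1,1), from a table of ratings. It assumes every performance is scored by the same number of raters. The function name `icc_one_way` and the ratings are hypothetical, and the datasets' own annotation protocols may call for a different ICC variant.

```python
def icc_one_way(ratings):
    """ICC(1,1) for ratings[i][j] = score given by rater j to performance i.

    Uses the one-way ANOVA form:
        (MS_between - MS_within) / (MS_between + (k - 1) * MS_within)
    where k is the number of raters per performance.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    # Variability between performances.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    # Variability between raters scoring the same performance.
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Made-up ratings: 4 performances, 2 raters each.
ratings = [[9, 10], [6, 7], [8, 8], [4, 5]]
icc = icc_one_way(ratings)
print(round(icc, 3))  # about 0.921 for this toy table
```

Values near 1 indicate that raters largely agree relative to the spread between performances; low values would suggest the labels carry enough noise to blur comparisons between methods.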

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on dataset validation and results presentation. We address each major comment below and outline planned revisions where appropriate.

Point-by-point responses
  1. Referee: [Datasets and Evaluation] § on Datasets (Kimore, UI-PRMD, IRDS): the central outperformance claim rests on expert-annotated quality labels as ground truth, yet the manuscript supplies no inter-rater reliability statistics, no controls for patient-specific confounders (age, pathology, prior injury), and no discussion of how these labels isolate exercise quality independent of such factors. If label noise or bias is present, the reported superiority and joint-generalization results may simply reflect dataset artifacts rather than learned features.

    Authors: The quality labels originate from the original dataset publications, which describe expert annotation protocols. We agree that inter-rater reliability statistics and explicit controls for confounders are not reported in our manuscript. We will add a dedicated paragraph in the Datasets section discussing these limitations, the controlled collection conditions of each dataset, and how the labels target exercise quality. No new inter-rater study is feasible without access to the original annotators, but the added discussion will clarify the scope of our claims.

    revision: yes

  2. Referee: [Results] Results section / tables: the abstract asserts quantitative outperformance, yet the provided manuscript excerpt contains no numerical metrics, baseline comparisons, ablation studies, or error analysis. Without these load-bearing elements, the superiority and practical-relevance claims cannot be evaluated.

    Authors: The full manuscript contains a complete Results section with quantitative metrics (accuracy, MAE, etc.), baseline comparisons on all three datasets, ablation studies on axial attention and curve aggregation, and error analysis. The excerpt supplied to the referee appears to have omitted these sections. We will ensure the revised submission presents all tables and figures immediately after the method description for clarity.

    revision: no

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

Full rationale

The paper presents a standard supervised ML pipeline: a transformer model is trained on labeled point-cloud data from three public datasets (Kimore, UI-PRMD, IRDS) whose quality scores are supplied by external expert annotations. No equations, fitted parameters, or self-citations are shown that would make any reported performance metric or generalization claim equivalent to the training inputs by construction. The central claims (feature extraction via axial attention and curve aggregation, outperformance on held-out test splits) rest on empirical evaluation rather than definitional or self-referential reduction. This is the normal, non-circular case for a data-driven computer-vision paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on standard assumptions of supervised learning on annotated RGBD joint data and the representativeness of the three named datasets; no explicit free parameters, axioms, or invented entities are described in the abstract.

pith-pipeline@v0.9.1-grok · 5774 in / 1116 out tokens · 25450 ms · 2026-06-30T06:53:23.228334+00:00 · methodology

