arxiv: 2606.07431 · v2 · pith:LRFRQ7D7new · submitted 2026-06-05 · 💻 cs.CV

OpenGlass: Ultra-Low-Power On-Device AI Eyewear with Event-based Vision

Pith reviewed 2026-06-27 22:01 UTC · model grok-4.3

classification 💻 cs.CV
keywords smart glasses · event-based vision · on-device machine learning · low-power systems · wearable computing · gesture recognition · power management · RISC-V

The pith

OpenGlass achieves up to 11.5 hours of continuous on-device machine learning in eyewear from a 200 mAh battery via event-driven power management.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces an open-source smart glasses platform that integrates event-based vision and embedded machine learning under tight power and size limits. It centers on a hardware-software co-design that uses a coordinator chip to wake the main processor only when events occur, leaving it powered down otherwise. This yields the reported battery runtime while supporting a modular camera interface. A hand gesture recognition demonstration shows the system handling real inference workloads at usable accuracy and speed. A sympathetic reader would care because the approach addresses the core barrier of short battery life that has limited always-on AI in compact wearables.

Core claim

The platform employs a flexible FPC interposer for camera modularity and a co-designed power system with a configurable PMIC plus nRF5340 coordinator for event-driven wake-up. This architecture keeps the GAP9 RISC-V SoC powered down between inferences. The resulting prototype delivers up to 11.5 hours of continuous on-device ML from a 200 mAh battery. In the LynX dataset evaluation of egocentric hand gesture recognition using polarity-separated event histograms, an R(2+1)D model reaches 83.94 percent cross-subject accuracy and 78.3 ms end-to-end latency on the GAP9.

What carries the argument

The event-driven wake-up mechanism via the nRF5340 coordinator that activates the GAP9 RISC-V SoC only when relevant events are detected.
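
The gating idea can be sketched in a few lines of Python. This is a toy model, not the OpenGlass firmware: the `Coordinator` class, the use of a per-window event count as the wake criterion, and the threshold of 500 are all assumptions made for illustration. The summary above does not say how the nRF5340 decides that an event is relevant.

```python
from dataclasses import dataclass, field


@dataclass
class Coordinator:
    """Toy model of an always-on coordinator gating a main SoC (hypothetical logic)."""
    wake_threshold: int                      # events per window needed to wake the SoC (assumed)
    wake_count: int = 0
    log: list = field(default_factory=list)

    def on_window(self, event_count: int) -> bool:
        """Process one time window; return True if the SoC was woken."""
        if event_count < self.wake_threshold:
            self.log.append("sleep")         # SoC stays powered down
            return False
        self.wake_count += 1
        self.log.append("infer")             # SoC powers up, runs one inference, powers down
        return True


coordinator = Coordinator(wake_threshold=500)
windows = [12, 40, 900, 1500, 30, 0, 700]    # made-up event counts per window
woken = [coordinator.on_window(n) for n in windows]
duty = coordinator.wake_count / len(windows)  # fraction of windows that triggered inference
print(coordinator.log, f"duty={duty:.2f}")
```

The fraction of windows that trigger a wake-up is the quantity that drives average power, which is why the choice of wake criterion matters as much as the sleep current.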

If this is right

  • On-device ML workloads become feasible for extended periods without recharging in compact wearable form factors.
  • Camera integration can switch between event-based and frame-based sensors without requiring a full hardware redesign.
  • Open release of designs, firmware, and models lowers the barrier for others to prototype new sensor-algorithm combinations.
  • Low-latency inference pipelines support interactive applications such as real-time gesture control.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same wake-up strategy could be adapted to other small battery-powered devices that need occasional AI processing.
  • Event-driven activation may cut average power more than frame-based sampling in vision-heavy wearables.
  • Community extensions of the open platform could add non-vision sensors for broader context awareness.

Load-bearing premise

The coordinator chip detects relevant events accurately enough to wake the main processor without missing key inputs or adding enough overhead to erase the battery-life gains.
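
A simple duty-cycle model shows how this premise could fail. The sketch below assumes the average current is the duty-weighted mix of the SoC's active and sleep currents plus a constant coordinator current. The function name `runtime_hours` and every current value in the sweep are hypothetical, since the summary reports no measured currents; only the 200 mAh capacity comes from the paper.

```python
def runtime_hours(capacity_mah, i_active_ma, i_sleep_ma, i_coordinator_ma, duty):
    """Estimate runtime under a two-state duty-cycle model (idealized battery).

    duty is the fraction of time the main SoC is active, between 0 and 1.
    """
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty must lie in [0, 1]")
    i_avg = duty * i_active_ma + (1.0 - duty) * i_sleep_ma + i_coordinator_ma
    return capacity_mah / i_avg


# Hypothetical currents, chosen only to show the shape of the trade-off.
for duty in (0.05, 0.25, 0.50, 1.00):
    hours = runtime_hours(200.0, i_active_ma=40.0, i_sleep_ma=1.0,
                          i_coordinator_ma=2.0, duty=duty)
    print(f"duty={duty:.2f} -> {hours:.1f} h")
```

Under this model, frequent false wake-ups raise `duty` and a power-hungry detector raises the constant term, so either can erase the gain from keeping the SoC off.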

What would settle it

A side-by-side battery-drain measurement during continuous event-driven gesture recognition in which the measured runtime falls substantially short of the claimed 11.5 hours.
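
The claim also implies a current budget that such a measurement can be checked against. The sketch below divides the stated 200 mAh by the stated 11.5 hours and compares the result with a projected runtime from logged current samples. It assumes the full nominal capacity is usable and that samples are equally spaced; the sample values and the helper `projected_runtime_hours` are made up for illustration.

```python
def projected_runtime_hours(capacity_mah, current_samples_ma):
    """Project runtime from equally spaced current samples (mean-current approximation)."""
    if not current_samples_ma:
        raise ValueError("need at least one current sample")
    mean_ma = sum(current_samples_ma) / len(current_samples_ma)
    return capacity_mah / mean_ma


CLAIMED_HOURS = 11.5      # from the paper
CAPACITY_MAH = 200.0      # from the paper

# Average current the whole system may draw if the claim is to hold.
implied_budget_ma = CAPACITY_MAH / CLAIMED_HOURS

samples_ma = [15.0, 16.0, 30.0, 15.0, 24.0]   # hypothetical logger readings
projected = projected_runtime_hours(CAPACITY_MAH, samples_ma)
meets_claim = projected >= CLAIMED_HOURS
print(f"budget={implied_budget_ma:.1f} mA, projected={projected:.1f} h, ok={meets_claim}")
```

The budget works out to roughly 17.4 mA on average, covering the sensor, coordinator, PMIC losses, and the GAP9 together.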

Figures

Figures reproduced from arXiv: 2606.07431 by Ahmet Celik, Julian Moosmann, Michele Magno, Philipp Mayer, Pietro Bonazzi.

Figure 1: Overview of the OpenGlass platform. The system integrates multi […]
Figure 2: Annotated photographs of the fabricated hardware. Left: the Main Board PCB (portrait view) with the FPC AUX attached, showing the GAP9 ML […]
Figure 4: Block diagram of the proposed smart eyewear platform. The efficiency […]
Figure 5: Row-normalised confusion matrix for the best model (R(2+1)D, […]
Figure 6: Representative inference examples from the test set (subjects 8 and […]
Original abstract

Smart eyewear enables unobtrusive, context-aware interaction through multimodal sensors and on-device intelligence, but is severely limited by power, memory, and compute constraints in a compact form factor. Open-hardware platforms supporting event-based vision and embedded ML at this scale are rare. This work introduces an open-source smart glasses platform for rapid prototyping of novel sensors and algorithms. Its modular design uses a flexible FPC interposer to support both event-based and frame-based cameras without full PCB redesign. A hardware-software co-designed power management system combines a configurable PMIC with event-driven wake-up via an nRF5340 coordinator, keeping the GAP9 RISC-V SoC powered down between inferences. The prototype achieves up to 11.5 hours of continuous on-device ML from a 200 mAh battery. As a demonstration, an egocentric hand gesture recognition pipeline was evaluated on the LynX dataset using polarity-separated event histograms from a Prophesee GENX320 camera. R(2+1)D achieved the best cross-subject accuracy of 83.94% (macro F1 = 0.781) under leave-two-subjects-out validation, with 78.3 ms end-to-end inference latency on the GAP9. Temporal augmentation and removal of ambiguous classes provided the largest gains (+8.9 pp). All hardware designs, firmware, and models are released open source.
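
The abstract reports both accuracy and macro F1, and the two can diverge. Accuracy weights every sample equally, while macro F1 averages the per-class F1 scores, so a weak minority class pulls macro F1 down more than it pulls accuracy. The sketch below computes both from a confusion matrix (rows are true classes, columns are predictions). The 2×2 matrix is made up for illustration and is not data from the paper.

```python
import numpy as np


def accuracy_and_macro_f1(confusion):
    """Return (accuracy, macro F1) for a square confusion matrix."""
    cm = np.asarray(confusion, dtype=np.float64)
    tp = np.diag(cm)
    support = cm.sum(axis=1)        # true samples per class
    predicted = cm.sum(axis=0)      # predictions per class
    zeros = np.zeros_like(tp)
    precision = np.divide(tp, predicted, out=zeros.copy(), where=predicted > 0)
    recall = np.divide(tp, support, out=zeros.copy(), where=support > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=zeros.copy(), where=denom > 0)
    return tp.sum() / cm.sum(), f1.mean()


# Hypothetical imbalanced two-class result.
acc, macro_f1 = accuracy_and_macro_f1([[90, 10],
                                       [5, 5]])
print(f"accuracy={acc:.3f}, macro F1={macro_f1:.3f}")
```

In this made-up case the majority class dominates accuracy while the minority class halves its own F1, which is the kind of gap a per-class breakdown would make visible.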

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces OpenGlass, an open-source smart glasses platform for event-based vision and on-device ML. It describes a modular FPC interposer design supporting event- and frame-based cameras, a hardware-software co-designed power management system using a configurable PMIC and nRF5340 coordinator for event-driven wake-up (keeping the GAP9 RISC-V SoC powered down between inferences), and reports up to 11.5 hours of continuous operation on a 200 mAh battery. A hand-gesture recognition demonstration on the LynX dataset using polarity-separated event histograms achieves 83.94% cross-subject accuracy (R(2+1)D, macro F1 = 0.781) under leave-two-subjects-out validation with 78.3 ms end-to-end latency on the GAP9; temporal augmentation together with removal of ambiguous classes yields a +8.9 pp gain. All hardware, firmware, and models are released open source.

Significance. If the power-management measurements hold, the work supplies a valuable open hardware platform addressing severe power and form-factor constraints for wearable context-aware AI, a domain with few existing open event-vision solutions. The explicit open-source release of complete hardware designs, firmware, and trained models is a concrete strength that directly supports reproducibility and extension by the community.

major comments (2)
  1. [Abstract / Power Management] Abstract and power-management description: The headline claim of 11.5 h battery life from a 200 mAh cell rests on the nRF5340 event-driven wake-up successfully detecting relevant events and activating the GAP9 only when needed with low overhead. No quantitative data (average power, wake-up frequency, nRF5340 consumption, or measured duty cycle) are supplied to substantiate this assumption, leaving the central hardware result without direct empirical support.
  2. [Evaluation / Results] Gesture-recognition evaluation: The reported 83.94% accuracy and the +8.9 pp gain from temporal augmentation and ambiguous-class removal are given for leave-two-subjects-out validation, yet the manuscript provides no error bars, per-fold statistics, or comparison table against alternative models or ablations, making it impossible to assess whether the cross-subject claim is robust.
minor comments (2)
  1. The abstract states that the designs are released open source, but the main text gives no explicit repository URL or DOI; adding one would improve accessibility.
  2. [Gesture Recognition Pipeline] Notation for the polarity-separated event histogram construction is described only at a high level; a short equation or pseudocode block would clarify the input representation to the R(2+1)D model.
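
One plausible reading of "polarity-separated event histograms" is sketched below: events are split into equal-width time bins, and ON and OFF events are counted per pixel in separate channels. This is an illustration of the general technique, not the authors' pipeline. The function name, the equal-width binning, the integer timestamp units, and the `(num_bins, 2, height, width)` layout are all assumptions, and the tiny sensor size in the example is chosen only to keep it readable.

```python
import numpy as np


def polarity_histograms(x, y, t, p, height, width, num_bins):
    """Count events per (time bin, polarity, row, column).

    x, y: pixel coordinates; t: integer timestamps (for example microseconds);
    p: polarity, 0 for OFF events and 1 for ON events.
    Returns a float32 array of shape (num_bins, 2, height, width).
    """
    x = np.asarray(x, dtype=np.int64)
    y = np.asarray(y, dtype=np.int64)
    t = np.asarray(t, dtype=np.int64)
    p = np.asarray(p, dtype=np.int64)
    hist = np.zeros((num_bins, 2, height, width), dtype=np.float32)
    if t.size == 0:
        return hist
    t0 = t.min()
    span = max(int(t.max() - t0), 1)                   # avoid division by zero
    bins = np.minimum((t - t0) * num_bins // span, num_bins - 1)
    # np.add.at accumulates correctly when several events hit the same cell.
    np.add.at(hist, (bins, p, y, x), 1.0)
    return hist


# Four made-up events on a 2x3 sensor, split into two time bins.
frames = polarity_histograms(x=[0, 1, 1, 2], y=[0, 0, 0, 1],
                             t=[0, 10, 20, 100], p=[1, 0, 0, 1],
                             height=2, width=3, num_bins=2)
print(frames.shape, frames.sum())
```

A video model such as R(2+1)D would then consume the time-bin axis as its temporal dimension and the two polarity planes as input channels, though the exact tensor ordering used in the paper is not stated here.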

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below.

Point-by-point responses
  1. Referee: [Abstract / Power Management] Abstract and power-management description: The headline claim of 11.5 h battery life from a 200 mAh cell rests on the nRF5340 event-driven wake-up successfully detecting relevant events and activating the GAP9 only when needed with low overhead. No quantitative data (average power, wake-up frequency, nRF5340 consumption, or measured duty cycle) are supplied to substantiate this assumption, leaving the central hardware result without direct empirical support.

    Authors: We agree that the power-management claim requires supporting quantitative data. The manuscript will be revised to include a new table or subsection with the measured average system power in event-driven mode, the observed wake-up frequency during the LynX demonstration, nRF5340 consumption figures, and the resulting GAP9 duty cycle that yields the reported 11.5 h runtime on the 200 mAh cell. revision: yes

  2. Referee: [Evaluation / Results] Gesture-recognition evaluation: The reported 83.94% accuracy and the +8.9 pp gain from temporal augmentation and ambiguous-class removal are given for leave-two-subjects-out validation, yet the manuscript provides no error bars, per-fold statistics, or comparison table against alternative models or ablations, making it impossible to assess whether the cross-subject claim is robust.

    Authors: We agree that the evaluation would be more robust with additional statistics. The revised manuscript will add per-fold accuracy values with mean and standard deviation, error bars on the reported figures, and a comparison table including alternative models and ablations of the temporal augmentation and class-removal steps. revision: yes
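
The per-fold statistics promised here are straightforward to produce once the folds are enumerated. The sketch below assumes that "leave-two-subjects-out" means holding out every unordered pair of subjects in turn; the paper may instead use disjoint pairs or a fixed subset. The subject IDs and accuracy values are placeholders, not results from the paper.

```python
from itertools import combinations
from statistics import mean, stdev


def leave_two_out_folds(subjects):
    """Yield (train_subjects, held_out_pair) for every unordered pair of subjects."""
    subjects = sorted(set(subjects))
    for held_out in combinations(subjects, 2):
        train = [s for s in subjects if s not in held_out]
        yield train, list(held_out)


def summarize(per_fold_accuracy):
    """Return (mean, sample standard deviation) across folds."""
    return mean(per_fold_accuracy), stdev(per_fold_accuracy)


folds = list(leave_two_out_folds([1, 2, 3, 4]))   # placeholder subject IDs
for train, held_out in folds:
    print(f"train={train} test={held_out}")

# Placeholder accuracies, one per fold, standing in for real evaluation results.
fold_mean, fold_std = summarize([0.80, 0.90, 0.85, 0.82, 0.88, 0.84])
print(f"{fold_mean:.3f} +/- {fold_std:.3f}")
```

Reporting the spread alongside the mean shows whether a headline figure reflects consistent cross-subject behaviour or is carried by a few easy held-out pairs.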

Circularity Check

0 steps flagged

No circularity; claims rest on direct hardware measurements and external dataset evaluation

Full rationale

The paper reports the 11.5-hour battery life as a measured outcome of the prototype under the described PMIC + nRF5340 architecture, and the 83.94% accuracy as evaluated on the external LynX dataset with R(2+1)D. No equations, fitted parameters, or self-citations are used to derive these quantities from themselves. The architecture description does not contain a derivation chain that reduces the headline results to inputs by construction. This is a standard empirical hardware/ML prototype paper with no load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The platform rests on standard assumptions about event-camera power savings and coordinator-chip wake-up behavior rather than new physical postulates or fitted constants.

axioms (1)
  • domain assumption: Event-driven wake-up via a separate low-power coordinator can keep the main SoC powered down between inferences without eroding the claimed battery life.
    Invoked in the description of the power management system.
