OpenGlass: Ultra-Low-Power On-Device AI Eyewear with Event-based Vision

Ahmet Celik; Julian Moosmann; Michele Magno; Philipp Mayer; Pietro Bonazzi

arxiv: 2606.07431 · v2 · pith:LRFRQ7D7new · submitted 2026-06-05 · 💻 cs.CV

Pith reviewed 2026-06-27 22:01 UTC · model grok-4.3

classification 💻 cs.CV

keywords smart glasses · event-based vision · on-device machine learning · low-power systems · wearable computing · gesture recognition · power management · RISC-V

The pith

OpenGlass achieves up to 11.5 hours of continuous on-device machine learning in eyewear from a 200 mAh battery via event-driven power management.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces an open-source smart glasses platform that integrates event-based vision and embedded machine learning under tight power and size limits. It centers on a hardware-software co-design that uses a coordinator chip to wake the main processor only when events occur, leaving it powered down otherwise. This yields the reported battery runtime while supporting a modular camera interface. A hand gesture recognition demonstration shows the system handling real inference workloads at usable accuracy and speed. A sympathetic reader would care because the approach addresses the core barrier of short battery life that has limited always-on AI in compact wearables.

Core claim

The platform employs a flexible FPC interposer for camera modularity and a co-designed power system with a configurable PMIC plus nRF5340 coordinator for event-driven wake-up. This architecture keeps the GAP9 RISC-V SoC powered down between inferences. The resulting prototype delivers up to 11.5 hours of continuous on-device ML from a 200 mAh battery. In the LynX dataset evaluation of egocentric hand gesture recognition using polarity-separated event histograms, an R(2+1)D model reaches 83.94 percent cross-subject accuracy and 78.3 ms end-to-end latency on the GAP9.
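
As a quick sanity check on the headline figure, the implied average current draw follows from dividing battery capacity by runtime. The sketch below assumes the full nominal 200 mAh is usable and ignores converter losses and battery ageing, none of which this summary quantifies; the function name is illustrative.

```python
def implied_average_current_ma(capacity_mah: float, runtime_h: float) -> float:
    """Average current (mA) that drains `capacity_mah` in `runtime_h` hours."""
    if runtime_h <= 0:
        raise ValueError("runtime_h must be positive")
    return capacity_mah / runtime_h


# Figures quoted in the summary: 200 mAh battery, up to 11.5 h runtime.
avg_ma = implied_average_current_ma(200.0, 11.5)
print(f"implied average current: {avg_ma:.1f} mA")  # prints 17.4 mA
```

Under these idealised assumptions, the whole system (coordinator, camera, and main SoC together) would have to average roughly 17 mA to reach the reported runtime.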

What carries the argument

The event-driven wake-up mechanism via the nRF5340 coordinator that activates the GAP9 RISC-V SoC only when relevant events are detected.
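
The gating idea can be illustrated with a toy model. The sketch below is not the paper's firmware: the windowed event-count threshold, the function name, and all numbers are assumptions chosen only to show how a coordinator might decide when to power the main SoC.

```python
from typing import List


def wake_decisions(event_counts: List[int], threshold: int) -> List[bool]:
    """Return, per time window, whether the coordinator would wake the main SoC.

    Assumption: the coordinator sees one event count per fixed-length window
    and wakes the SoC when the count reaches `threshold`. The actual wake-up
    criterion is not described in this summary.
    """
    return [count >= threshold for count in event_counts]


# Hypothetical event counts for eight consecutive windows.
counts = [3, 0, 120, 450, 12, 0, 300, 5]
wakes = wake_decisions(counts, threshold=100)
duty = sum(wakes) / len(wakes)
print(wakes)
print(f"fraction of windows with SoC awake: {duty:.3f}")  # 0.375
```

Raising the threshold lowers the fraction of time the SoC is awake but risks missing genuine inputs; lowering it does the reverse.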

If this is right

On-device ML workloads become feasible for extended periods without recharging in compact wearable form factors.
Camera integration can switch between event-based and frame-based sensors without requiring a full hardware redesign.
Open release of designs, firmware, and models lowers the barrier for others to prototype new sensor-algorithm combinations.
Low-latency inference pipelines support interactive applications such as real-time gesture control.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same wake-up strategy could be adapted to other small battery-powered devices that need occasional AI processing.
Event-driven activation may cut average power more than frame-based sampling in vision-heavy wearables.
Community extensions of the open platform could add non-vision sensors for broader context awareness.

Load-bearing premise

The coordinator chip detects relevant events accurately enough to wake the main processor without missing key inputs, and without adding so much overhead that the battery-life gains are erased.

What would settle it

A direct measurement of battery runtime during continuous event-driven gesture recognition that falls substantially short of the claimed 11.5 hours.

Figures

Figures reproduced from arXiv: 2606.07431 by Ahmet Celik, Julian Moosmann, Michele Magno, Philipp Mayer, Pietro Bonazzi.

**Figure 2.** Annotated photographs of the fabricated hardware. Left: the Main Board PCB (portrait view) with the FPC AUX attached, showing the GAP9 ML …

**Figure 4.** Block diagram of the proposed smart eyewear platform. The efficiency …

**Figure 5.** Row-normalised confusion matrix for the best model (R(2+1)D, …

**Figure 6.** Representative inference examples from the test set (subjects 8 and …

Original abstract

Smart eyewear enables unobtrusive, context-aware interaction through multimodal sensors and on-device intelligence, but is severely limited by power, memory, and compute constraints in a compact form factor. Open-hardware platforms supporting event-based vision and embedded ML at this scale are rare. This work introduces an open-source smart glasses platform for rapid prototyping of novel sensors and algorithms. Its modular design uses a flexible FPC interposer to support both event-based and frame-based cameras without full PCB redesign. A hardware-software co-designed power management system combines a configurable PMIC with event-driven wake-up via an nRF5340 coordinator, keeping the GAP9 RISC-V SoC powered down between inferences. The prototype achieves up to 11.5 hours of continuous on-device ML from a 200 mAh battery. As a demonstration, an egocentric hand gesture recognition pipeline was evaluated on the LynX dataset using polarity-separated event histograms from a Prophesee GENX320 camera. R(2+1)D achieved the best cross-subject accuracy of 83.94% (macro F1 = 0.781) under leave-two-subjects-out validation, with 78.3 ms end-to-end inference latency on the GAP9. Temporal augmentation and removal of ambiguous classes provided the largest gains (+8.9 pp). All hardware designs, firmware, and models are released open source.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

Open eyewear platform with useful modular hardware and open releases, but the 11.5-hour battery claim rests on unshown wake-up details.

The paper presents an open-source smart glasses platform that supports event-based cameras alongside a low-power RISC-V SoC. The headline result is up to 11.5 hours of continuous on-device ML from a 200 mAh battery, using a PMIC and an nRF5340 coordinator for event-driven wake-up. The authors also show an egocentric hand gesture demo on the LynX dataset, with R(2+1)D reaching 83.94% cross-subject accuracy and 78.3 ms latency.

What is actually new is the modular FPC interposer that lets the same board handle both event and frame cameras without a full redesign, plus the specific power-management architecture tuned to this form factor. Releasing the hardware files, firmware, and models is a clear positive for anyone who wants to iterate on compact always-on vision hardware.

The work does well by grounding the claims in direct hardware measurements rather than simulation alone and by documenting the gains from simple changes like temporal augmentation on the event histograms. The demonstration is straightforward and reproducible from the released assets.

The soft spot is the battery-life number. It depends on the nRF5340 triggering the GAP9 only for relevant events, with low overhead and without frequent or costly wake-ups eroding the power budget. The abstract describes the architecture but supplies no quantitative data on average power, wake-up frequency, nRF5340 consumption, or measured duty cycle. If the full paper lacks those breakdowns, the central power result remains hard to evaluate.
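
The missing breakdown can be framed with a standard two-state duty-cycle model. The sketch below is illustrative only: the active and sleep currents and the wake-up rate are placeholder values, not measurements from the paper. Only the 200 mAh capacity and the 78.3 ms latency come from the abstract, and treating the latency as the full awake time per inference is itself an assumption.

```python
def average_current_ma(active_ma: float, sleep_ma: float, duty: float) -> float:
    """Time-weighted average current for a two-state (active/sleep) system."""
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty must lie in [0, 1]")
    return duty * active_ma + (1.0 - duty) * sleep_ma


def runtime_hours(capacity_mah: float, avg_ma: float) -> float:
    """Ideal runtime, ignoring converter losses and battery derating."""
    return capacity_mah / avg_ma


# Placeholder assumptions (NOT from the paper):
ACTIVE_MA = 50.0        # hypothetical system current while inferring
SLEEP_MA = 1.0          # hypothetical current with the main SoC powered down
INFERENCES_PER_S = 2.0  # hypothetical wake-up rate

# From the abstract: 78.3 ms end-to-end latency, 200 mAh battery.
LATENCY_S = 0.0783
duty = min(1.0, INFERENCES_PER_S * LATENCY_S)

avg = average_current_ma(ACTIVE_MA, SLEEP_MA, duty)
print(f"duty cycle: {duty:.4f}, average current: {avg:.2f} mA, "
      f"runtime: {runtime_hours(200.0, avg):.1f} h")
```

With measured values in place of the placeholders, the same model would show how sensitive the runtime is to wake-up frequency and sleep current.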

This is for engineers and researchers building embedded event-vision systems who need a concrete starting platform. A reader focused on wearable hardware or open-source eyewear would get practical value from the designs and numbers. It deserves peer review because the platform contribution is real and released, even if the power section would benefit from more detailed measurements in revision.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces OpenGlass, an open-source smart glasses platform for event-based vision and on-device ML. It describes a modular FPC interposer design supporting event- and frame-based cameras, a hardware-software co-designed power management system using a configurable PMIC and nRF5340 coordinator for event-driven wake-up (keeping the GAP9 RISC-V SoC powered down between inferences), and reports up to 11.5 hours of continuous operation on a 200 mAh battery. A hand-gesture recognition demonstration on the LynX dataset using polarity-separated event histograms achieves 83.94% cross-subject accuracy (R(2+1)D, macro F1 = 0.781) under leave-two-subjects-out validation with 78.3 ms end-to-end latency on GAP9; temporal augmentation and removal of ambiguous classes yield the largest gains (+8.9 pp). All hardware, firmware, and models are released open source.
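
Because the summary reports both accuracy and macro F1, it may help to recall how the two are computed from a confusion matrix and why they can differ. The sketch below uses a made-up three-class matrix; the numbers are illustrative and unrelated to the LynX results.

```python
from typing import List


def accuracy(cm: List[List[int]]) -> float:
    """Overall accuracy: correct predictions over all samples."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def macro_f1(cm: List[List[int]]) -> float:
    """Unweighted mean of per-class F1; rows are true classes, columns predictions."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                          # true class k, predicted otherwise
        fp = sum(cm[i][k] for i in range(n)) - tp     # predicted k, true class otherwise
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


# Hypothetical, imbalanced 3-class confusion matrix.
cm = [
    [90, 5, 5],
    [4, 14, 2],
    [3, 2, 5],
]
print(f"accuracy = {accuracy(cm):.3f}, macro F1 = {macro_f1(cm):.3f}")
```

Macro F1 weights every class equally, so weak performance on rare classes pulls it below accuracy. That is one possible reading of the gap between 83.94% accuracy and a macro F1 of 0.781, though the per-class results would be needed to confirm it.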

Significance. If the power-management measurements hold, the work supplies a valuable open hardware platform addressing severe power and form-factor constraints for wearable context-aware AI, a domain with few existing open event-vision solutions. The explicit open-source release of complete hardware designs, firmware, and trained models is a concrete strength that directly supports reproducibility and extension by the community.

major comments (2)

[Abstract / Power Management] Abstract and power-management description: The headline claim of 11.5 h battery life from a 200 mAh cell rests on the nRF5340 event-driven wake-up successfully detecting relevant events and activating the GAP9 only when needed with low overhead. No quantitative data (average power, wake-up frequency, nRF5340 consumption, or measured duty cycle) are supplied to substantiate this assumption, leaving the central hardware result without direct empirical support.
[Evaluation / Results] Gesture-recognition evaluation: The reported 83.94% accuracy and +8.9 pp gain from temporal augmentation and removal of ambiguous classes are given for leave-two-subjects-out validation, yet the manuscript provides no error bars, per-fold statistics, or comparison table against alternative models or ablations, making it impossible to assess whether the cross-subject claim is robust.

minor comments (2)

The abstract states that designs are released open source, but the main text does not include an explicit repository URL or DOI; adding one would improve accessibility.
[Gesture Recognition Pipeline] Notation for the event histogram construction (polarity-separated) is described only at a high level; a short equation or pseudocode block would clarify the input representation to the R(2+1)D model.
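
A common way to build such a representation, sketched here as an assumption rather than the paper's exact pipeline, is to accumulate events over a time window into two per-pixel count maps, one per polarity. The event tuple layout `(x, y, polarity, t)`, the function name, and the tiny 4x4 grid are illustrative; binning, normalisation, and resolution in the actual work may differ.

```python
from typing import List, Tuple

Event = Tuple[int, int, int, float]  # (x, y, polarity in {0, 1}, timestamp in s)


def polarity_histograms(
    events: List[Event], width: int, height: int, t_start: float, t_end: float
) -> List[List[List[int]]]:
    """Count events per pixel in [t_start, t_end), one channel per polarity.

    Returns a nested list indexed as hist[polarity][y][x].
    Events outside the window or the pixel grid are ignored.
    """
    hist = [[[0] * width for _ in range(height)] for _ in range(2)]
    for x, y, p, t in events:
        in_window = t_start <= t < t_end
        in_grid = 0 <= x < width and 0 <= y < height
        if in_window and in_grid and p in (0, 1):
            hist[p][y][x] += 1
    return hist


# Hypothetical events on a 4x4 grid; the last one falls outside the window.
events = [(0, 0, 1, 0.001), (0, 0, 1, 0.002), (3, 2, 0, 0.004), (1, 1, 1, 0.020)]
hist = polarity_histograms(events, width=4, height=4, t_start=0.0, t_end=0.010)
print(hist[1][0][0], hist[0][2][3], hist[1][1][1])  # 2 1 0
```

Stacking several consecutive windows would give the temporal axis that a video-style model such as R(2+1)D consumes.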

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below.

Referee: [Abstract / Power Management] Abstract and power-management description: The headline claim of 11.5 h battery life from a 200 mAh cell rests on the nRF5340 event-driven wake-up successfully detecting relevant events and activating the GAP9 only when needed with low overhead. No quantitative data (average power, wake-up frequency, nRF5340 consumption, or measured duty cycle) are supplied to substantiate this assumption, leaving the central hardware result without direct empirical support.

Authors: We agree that the power-management claim requires supporting quantitative data. The manuscript will be revised to include a new table or subsection with the measured average power of the nRF5340 in event-driven mode, the observed wake-up frequency during the LynX demonstration, nRF5340 consumption figures, and the resulting GAP9 duty cycle that yields the reported 11.5 h runtime on the 200 mAh cell.

revision: yes

Referee: [Evaluation / Results] Gesture-recognition evaluation: The reported 83.94% accuracy and +8.9 pp gain from temporal augmentation and removal of ambiguous classes are given for leave-two-subjects-out validation, yet the manuscript provides no error bars, per-fold statistics, or comparison table against alternative models or ablations, making it impossible to assess whether the cross-subject claim is robust.

Authors: We agree that the evaluation would be more robust with additional statistics. The revised manuscript will add per-fold accuracy values with mean and standard deviation, error bars on the reported figures, and a comparison table including alternative models and ablations of the temporal augmentation and class-removal steps.

revision: yes
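
The per-fold reporting described in this response could look like the following sketch. It enumerates every pair of held-out subjects and summarises fold accuracies with the mean and sample standard deviation. The subject IDs and accuracy values are invented, and the paper may use a fixed subset of pairs rather than all combinations.

```python
from itertools import combinations
from statistics import mean, stdev
from typing import List, Tuple


def leave_two_out_folds(subjects: List[int]) -> List[Tuple[List[int], Tuple[int, int]]]:
    """Return (train_subjects, held_out_pair) for every pair of subjects."""
    folds = []
    for pair in combinations(subjects, 2):
        train = [s for s in subjects if s not in pair]
        folds.append((train, pair))
    return folds


subjects = [1, 2, 3, 4]  # hypothetical subject IDs
folds = leave_two_out_folds(subjects)
print(len(folds), folds[0])  # 6 ([3, 4], (1, 2))

# Hypothetical per-fold accuracies (percent), one per fold above.
fold_acc = [82.0, 85.0, 80.0, 86.0, 84.0, 81.0]
print(f"{mean(fold_acc):.2f} +/- {stdev(fold_acc):.2f}")  # 83.00 +/- 2.37
```

Reporting the spread alongside the mean lets a reader judge whether a single headline accuracy is representative of unseen subjects.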

Circularity Check

0 steps flagged

No circularity; claims rest on direct hardware measurements and external dataset evaluation

The paper reports the 11.5-hour battery life as a measured outcome of the prototype under the described PMIC + nRF5340 architecture, and the 83.94% accuracy as evaluated on the external LynX dataset with R(2+1)D. No equations, fitted parameters, or self-citations are used to derive these quantities from themselves. The architecture description does not contain a derivation chain that reduces the headline results to inputs by construction. This is a standard empirical hardware/ML prototype paper with no load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The platform rests on standard assumptions about event-camera power savings and coordinator-chip wake-up behavior rather than new physical postulates or fitted constants.

axioms (1)

domain assumption: Event-driven wake-up via a separate low-power coordinator can keep the main SoC powered down between inferences without eroding the claimed battery life.
Invoked in the description of the power management system.


[1] [1]

Intelligent assistants strengthening personhood,

K. J. Hole, “Intelligent assistants strengthening personhood,”IEEE Computer, 2025

2025

[2] [2]

Imaging for all-day wearable smart glasses,

M. Goesele, D. Andersen, Y . Chen, S. Green, E. Ilg, C. Li, J. Liu, G. Kuo, L. Wan, and R. Newcombe, “Imaging for all-day wearable smart glasses,”arXiv, arXiv:2504.13060, 2025

arXiv 2025

[3] [3]

Automatic gaze analysis: A survey of deep learning based approaches,

S. Ghosh, A. Dhall, M. Hayat, J. Knibbe, and Q. Ji, “Automatic gaze analysis: A survey of deep learning based approaches,”IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024

2024

[4] [4]

Deep learn- ing for facial expression and human activity recognition using smart glasses,

M. Marinova, E. Chona, A. Kotevski, B. Sazdov, I. Kiprijanovska, S. Stankoski, M. Gjoreski, C. Nduka, and H. Gjoreski, “Deep learn- ing for facial expression and human activity recognition using smart glasses,”IEEE Access, 2025

2025

[5] [5]

GAPses: Versatile smart glasses for comfortable and fully-dry acquisition and parallel ultra-low-power processing of EEG and EOG,

S. Frey, M. A. Lucchini, V . Kartsch, T. M. Ingolfsson, A. H. Bernardi, M. Segessenmann, J. Osieleniec, S. Benatti, L. Benini, and A. Cos- settini, “GAPses: Versatile smart glasses for comfortable and fully-dry acquisition and parallel ultra-low-power processing of EEG and EOG,” IEEE Transactions on Biomedical Circuits and Systems, 2025

2025

[6] [6]

ElectraSight: Smart glasses with fully onboard non- invasive eye tracking using hybrid contact and contactless EOG,

N. Sch ¨arer, F. Villani, A. Melatur, S. Peter, T. Polonelli, and M. Magno, “ElectraSight: Smart glasses with fully onboard non- invasive eye tracking using hybrid contact and contactless EOG,”arXiv, arXiv:2412.14848, 2024

arXiv 2024

[7] [7]

mmet: mmwave radar-based eye tracking on smart glasses,

R. Ma, Y . Morimoto, J. S. Ho, S. Shiu, and J. Zhu, “mmet: mmwave radar-based eye tracking on smart glasses,”ACM CHI Conference on Human Factors in Computing Systems, 2025

2025

[8] [8]

Applications of terahertz spectroscopy in the detection and recognition of substances,

X. Fu, Y . Liu, Q. Chen, Y . Fu, and T. J. Cui, “Applications of terahertz spectroscopy in the detection and recognition of substances,”Frontiers in Physics, 2022

2022

[9] [9]

Seven HCI grand challenges revisited: Five-year progress,

C. Stephanidis, G. Salvendy, M. Antona, V . G. Duffy, Q. Gao, W. Kar- wowski, S. Konomi, F. Nah, S. Ntoa, P.-L. P. Rau, K. Siau, and J. Zhou, “Seven HCI grand challenges revisited: Five-year progress,” International Journal of Human–Computer Interaction, 2025

2025

[10] [10]

Meta smart glasses—large language models and the future for assistive glasses for individuals with vision impairments,

E. Waisberg, J. Ong, M. Masalkhi, N. Zaman, P. Sarker, A. G. Lee, and A. Tavakkoli, “Meta smart glasses—large language models and the future for assistive glasses for individuals with vision impairments,”Eye, 2024

2024

[11] [11]

EmBARDiment: An embodied AI agent for productivity in XR,

R. Bovo, S. Abreu, K. Ahuja, E. J. Gonzalez, L.-T. Cheng, and M. Gonzalez-Franco, “EmBARDiment: An embodied AI agent for productivity in XR,”IEEE Conference on Virtual Reality and 3D User Interfaces (VR), 2025

2025

[12] [12]

A systematic literature review on integrating AI-powered smart glasses into digital health management for proactive healthcare solutions,

B. Wang, Y . Zheng, X. Han, L. Kong, G. Xiao, Z. Xiao, and S. Chen, “A systematic literature review on integrating AI-powered smart glasses into digital health management for proactive healthcare solutions,”npj Digital Medicine, 2025

2025

[13] P. Wiese, G. İslamoğlu, M. Scherer, L. Macan, V. J. Jung, A. Burrello, F. Conti, and L. Benini, “Toward attention-based TinyML: A heterogeneous accelerated architecture and automated deployment flow,” IEEE Design & Test, 2025

[14] E. Flamand, D. Rossi, F. Conti, I. Loi, A. Pullini, F. Rotenberg, and L. Benini, “GAP-8: A RISC-V SoC for AI at the edge of the IoT,” IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP), 2018

[15] A. Garofalo, M. Rusci, F. Conti, D. Rossi, and L. Benini, “PULP-NN: Accelerating quantized neural networks on parallel ultra-low-power RISC-V processors,” Philosophical Transactions of the Royal Society A, 2020

[16] M. Rusci, A. Capotondi, and L. Benini, “Memory-driven mixed low precision quantization for enabling deep network inference on microcontrollers,” arXiv, arXiv:1905.13082, 2020

[17] C. Banbury, V. J. Reddi, M. Lam, W. Fu, A. Fazel, J. Holleman, X. Huang, R. Hurtado, D. Kanter, A. Lokhmotov, D. Patterson, D. Pau, J.-s. Seo, J. Sieracki, U. Thakker, M. Verhelst, and P. Yadav, “MLPerf tiny benchmark,” arXiv, arXiv:2106.07597, 2021

[18] J. Lin, W.-M. Chen, Y. Lin, J. Cohn, C. Gan, and S. Han, “MCUNet: Tiny deep learning on IoT devices,” arXiv, arXiv:2007.10319, 2020

[19] R. David, J. Duke, A. Jain, V. Janapa Reddi, N. Jeffries, J. Li, N. Kreeger, I. Nappier, M. Natraj, T. Wang, and P. Warden, “TensorFlow Lite Micro: Embedded machine learning on TinyML systems,” arXiv, arXiv:2010.08678, 2021

[20] G. Gallego, T. Delbrück, G. Orchard, C. Bartolozzi, B. Taba, A. Censi, S. Leutenegger, A. J. Davison, J. Conradt, K. Daniilidis, and D. Scaramuzza, “Event-based vision: A survey,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022

[21] B. Chakravarthi, A. A. Verma, K. Daniilidis, C. Fermuller, and Y. Yang, “Recent event camera innovations: A survey,” arXiv, arXiv:2408.13627, 2024

[22] P. Lichtsteiner, C. Posch, and T. Delbruck, “A 128×128 120 dB 15 µs latency asynchronous temporal contrast vision sensor,” IEEE Journal of Solid-State Circuits, 2008

[23] P. Bartoli, V. Jayaprakash, J. Moosmann, P. Mayer, F. Zappa, and M. Magno, “LynX: An event-based gesture dataset for egocentric interaction in extended reality,” IEEE International Workshop on Advances in Sensors and Interfaces (IWASI), 2025

[24] Meta, “Ray-ban meta wayfarer gen 2 — ai glasses,” accessed: 2025-10-09

[25] Microsoft, “Hololens 2 hardware,” 2023, accessed: 2025-10-09

[26] Brilliant Labs, “Frame hardware documentation,” 2025, accessed: 2025-10-09

[27] Mentra-Community, “Opensourcesmartglasses,” 2023, accessed: 2025-10-09

[28] J. Moosmann, P. Bonazzi, Y. Li, S. Bian, P. Mayer, L. Benini, and M. Magno, “Ultra-efficient on-device object detection on AI-integrated smart glasses with TinyissimoYOLO,” arXiv, arXiv:2311.01057, 2023

[29] T. Polonelli, L. Schulthess, P. Mayer, M. Magno, and L. Benini, “H-Watch: An open, connected platform for AI-enhanced COVID-19 infection symptoms monitoring and contact tracing,” IEEE International Symposium on Circuits and Systems (ISCAS), 2021

[30] M. Adra, S. Melcarne, N. Mirabet-Herranz, and J.-L. Dugelay, “Event-based solutions for human-centered applications: A comprehensive review,” arXiv, arXiv:2502.18490, 2025

[31] D. Damen, H. Doughty, G. M. Farinella, S. Fidler, A. Furnari, E. Kazakos, D. Moltisanti, J. Munro, T. Perrett, W. Price, and M. Wray, “Scaling egocentric vision: The EPIC-Kitchens dataset,” European Conference on Computer Vision (ECCV), 2018

[32] C. Plizzari, M. Planamente, G. Goletto, M. Cannici, E. Gusso, M. Matteucci, and B. Caputo, “E²(GO)MOTION: Motion augmented event stream for egocentric action recognition,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022

[33] L. Wang, H. Shi, X. Yin, K. Yang, K. Wang, and J. Bai, “EgoEvGesture: Gesture recognition based on egocentric event camera,” arXiv, arXiv:2503.12419, 2025

[34] P. Bhattacharyya, J. Mitton, R. Page, O. Morgan, B. Menzies, G. Homewood, K. Jacobs, P. Baesso, D. Trickett, C. Mair, T. Muhonen, R. Clark, L. Berridge, R. Vigars, and I. Wallace, “Helios: An extremely low power event-based gesture recognition for always-on smart eyewear,” arXiv, arXiv:2407.05206, 2024

[35] P. Bhattacharyya, J. Mitton, R. Page, O. Morgan, O. Powell, B. Menzies, G. Homewood, K. Jacobs, P. Baesso, T. Muhonen, R. Vigars, and L. Berridge, “Helios 2.0: A robust, ultra-low power gesture recognition system optimised for event-sensor based wearables,” arXiv, arXiv:2503.07825, 2025

[36] R. Santambrogio, F. Caspani, G. Corti, F. Palermo, S. Mentasti, D. Trojaniello, and M. Matteucci, “Towards real-time online egocentric action recognition on smart eyewear,” Image Analysis and Processing – ICIAP 2025, 2025

[37] H. Müller, V. Kartsch, and L. Benini, “GAP9Shield: A 150GOPS AI-capable ultra-low power module for vision and ranging applications on nano-drones,” arXiv, arXiv:2407.13706, 2024

[38] Y. Tortorella, L. Bertaccini, D. Rossi, L. Benini, and F. Conti, “RedMulE: A compact FP16 matrix-multiplication accelerator for adaptive deep learning on RISC-V-based ultra-low-power SoCs,” arXiv, arXiv:2204.11192, 2022

[39] A. S. Prasad, M. Scherer, F. Conti, D. Rossi, A. Di Mauro, M. Eggimann et al., “Siracusa: A 16 nm heterogeneous RISC-V SoC for extended reality with at-MRAM neural engine,” IEEE Journal of Solid-State Circuits, 2024

[40] A. Sironi, M. Brambilla, N. Bourdis, X. Lagorce, and R. Benosman, “HATS: Histograms of averaged time surfaces for robust event-based object classification,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018

[41] Prophesee, “Event-based sensor GENX320,” Prophesee, 2024

[42] D. Tran, H. Wang, L. Torresani, J. Ray, Y. LeCun, and M. Paluri, “A closer look at spatiotemporal convolutions for action recognition,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018

[43] D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, “Learning spatiotemporal features with 3D convolutional networks,” IEEE International Conference on Computer Vision (ICCV), 2015

[44] J. Carreira and A. Zisserman, “Quo vadis, action recognition? A new model and the Kinetics dataset,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2017

[45] J. Lin, C. Gan, and S. Han, “TSM: Temporal shift module for efficient video understanding,” IEEE/CVF International Conference on Computer Vision (ICCV), 2019

[46] S. Bai, J. Z. Kolter, and V. Koltun, “An empirical evaluation of generic convolutional and recurrent networks for sequence modeling,” arXiv, arXiv:1803.01271, 2018

[47] T. M. Ingolfsson, X. Wang, M. Hersche, A. Burrello, L. Cavigelli, and L. Benini, “ECG-TCN: Wearable cardiac arrhythmia detection with a temporal convolutional network,” IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2021

[48] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, “MobileNets: Efficient convolutional neural networks for mobile vision applications,” arXiv, arXiv:1704.04861, 2017