Low-Cost Neuromorphic Fall Detection Using Synthetic Event Data and Hybrid SNNs

Daniel Yunge; Gonzalo Soto; Guillermo Rojas

arxiv: 2606.18732 · v1 · pith:TJKS6GXGnew · submitted 2026-06-17 · 💻 cs.LG · cs.CV

Pith reviewed 2026-06-26 21:20 UTC · model grok-4.3

classification 💻 cs.LG cs.CV

keywords: fall detection, spiking neural networks, event-based vision, hybrid SNN-CNN, synthetic event data, dynamic vision sensor, energy efficiency, neuromorphic computing

The pith

Hybrid SNN-CNN models detect falls from synthetic event data generated from smartphone videos, with efficiency gains and no accuracy loss.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops hybrid models combining spiking neural networks with convolutional components to process event-based data simulated from ordinary smartphone videos, focused on human fall detection. These models are tested via simulation on several datasets against traditional machine learning baselines. A sympathetic reader would care because the results point to low-power neuromorphic systems that could run on edge devices using only accessible video sources for training data.

Core claim

Hybrid architectures that merge spiking layers with convolutional elements, when trained on event streams synthesized from conventional video frames, achieve fall detection performance comparable to standard models while delivering substantial efficiency improvements, validating the use of SNNs and DVS-style data for this task without specialized sensors.

What carries the argument

Hybrid SNNs integrating spiking neural layers with CNN components, trained on synthetic Dynamic Vision Sensor event data converted from smartphone video frames.
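
To make "spiking neural layers" concrete, here is a minimal sketch of a leaky integrate-and-fire (LIF) layer, the neuron model most often used in such hybrids. The summary does not say which neuron model the paper uses, so the LIF choice, the function names, and the `beta` and `threshold` values are all illustrative assumptions. In a hybrid design, a convolutional front end would typically supply `input_current`.

```python
def lif_forward(input_current, beta=0.9, threshold=1.0):
    """Run a layer of leaky integrate-and-fire neurons over time.

    input_current: list of time steps, each a list with one value per neuron.
    beta: per-step membrane decay factor (hypothetical value).
    threshold: firing threshold (hypothetical value).
    Returns a list of time steps, each a list of 0/1 spikes.
    """
    num_neurons = len(input_current[0])
    membrane = [0.0] * num_neurons
    spikes = []
    for step in input_current:
        row = []
        for i, current in enumerate(step):
            # Leak the previous potential, then integrate the new input.
            membrane[i] = beta * membrane[i] + current
            if membrane[i] >= threshold:
                row.append(1)
                membrane[i] -= threshold  # soft reset by subtraction
            else:
                row.append(0)
        spikes.append(row)
    return spikes


def spike_rates(spikes):
    """Average firing rate per neuron, a simple readout for classification."""
    steps = len(spikes)
    return [sum(column) / steps for column in zip(*spikes)]
```

Because a neuron only emits output when its membrane crosses the threshold, downstream work scales with the number of spikes rather than with the number of inputs, which is where the claimed efficiency would come from.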

If this is right

Energy consumption drops for fall detection while accuracy holds steady against conventional models.
Development of neuromorphic applications becomes feasible without access to real event cameras during training.
The approach extends to other spatio-temporal recognition tasks in resource-limited settings.
Simulated data pipelines lower barriers to testing SNN viability on complex real-world problems.
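
The energy consequence above can be approximated with a first-order operation count. The sketch below follows a common proxy in the SNN literature, not a method the paper is known to use: a conventional layer pays one multiply-accumulate (MAC) per connection, while a spiking layer pays one accumulate (AC) per spike per outgoing connection. The function names and the per-operation energies are hypothetical placeholders, and the proxy ignores memory traffic and static power.

```python
def ann_energy(macs, energy_per_mac):
    """Energy proxy for a conventional network: every MAC is executed."""
    return macs * energy_per_mac


def snn_energy(spikes_per_layer, fan_out_per_layer, energy_per_ac):
    """Energy proxy for a spiking network.

    spikes_per_layer: total spikes emitted by each layer over the whole input.
    fan_out_per_layer: outgoing connections per neuron in each layer.
    Each spike triggers one accumulate per outgoing connection.
    """
    synaptic_ops = sum(
        spikes * fan_out
        for spikes, fan_out in zip(spikes_per_layer, fan_out_per_layer)
    )
    return synaptic_ops * energy_per_ac


# Hypothetical numbers, for illustration only.
dense_cost = ann_energy(macs=1_000_000, energy_per_mac=2.0)
sparse_cost = snn_energy([5_000, 1_000], [100, 10], energy_per_ac=1.0)
print(f"ANN proxy: {dense_cost:.0f}, SNN proxy: {sparse_cost:.0f}")
```

The comparison is only as good as the measured spike counts, so the sparsity of the synthetic event streams feeds directly into any efficiency figure derived this way.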

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Direct hardware trials on edge neuromorphic chips would reveal whether simulation-to-reality gaps affect latency or power in deployed systems.
Combining the hybrid models with additional low-cost sensors could address edge cases like varying lighting or clothing that simulation may under-represent.
The same conversion method from video to events could support rapid prototyping for other detection tasks such as gesture or activity recognition.

Load-bearing premise

Event data simulated from smartphone videos reproduces the timing statistics and noise properties of real DVS hardware closely enough for models to transfer without performance degradation.
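
One way to probe this premise is to compare summary statistics of a synthetic stream against a real recording. The sketch below is an illustration, not something the paper reports: it assumes events are `(t, x, y, polarity)` tuples, and the function names and the choice of statistics are hypothetical.

```python
from statistics import mean


def per_pixel_intervals(events):
    """Time gaps between consecutive events at the same pixel."""
    last_seen = {}
    intervals = []
    for t, x, y, _polarity in sorted(events):
        key = (x, y)
        if key in last_seen:
            intervals.append(t - last_seen[key])
        last_seen[key] = t
    return intervals


def summarize(events):
    """Event rate, mean per-pixel interval, and share of ON events."""
    timestamps = [event[0] for event in events]
    duration = max(timestamps) - min(timestamps)
    intervals = per_pixel_intervals(events)
    return {
        "event_rate": len(events) / duration if duration > 0 else float("nan"),
        "mean_interval": mean(intervals) if intervals else float("nan"),
        "on_fraction": sum(1 for e in events if e[3] > 0) / len(events),
    }
```

Running `summarize` on a synthetic stream and on a real DVS recording of a comparable scene would show whether the two differ in rate, timing, or polarity balance before any model is involved.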

What would settle it

Deploying the trained hybrid models on recordings from physical DVS cameras and measuring whether accuracy or energy metrics deviate substantially from the simulated results.

Figures

Figures reproduced from arXiv: 2606.18732 by Daniel Yunge, Gonzalo Soto, Guillermo Rojas.

**Figure 3.** Hybrid neural network designed for NFDD dataset.

**Figure 4.** Falling, sitting and walking classes transformed into …

**Figure 5.** Training accuracy and loss history of DVS128 after …

**Figure 6.** Training and validation of the Neural Network with …

Original abstract

This work presents the development of hybrid models that integrate spiking neural networks (SNNs) with components of convolutional neural networks (CNNs) to learn from simulated event-based camera data (Dynamic Vision Sensor, DVS) generated from conventional smartphone videos. Aimed primarily at human fall detection, the approach leverages the energy efficiency and spatio-temporal processing capabilities of SNNs by converting video frames into event-based data. The proposed models are evaluated through simulations on multiple datasets, comparing their performance to that of traditional machine learning models. Results demonstrate significant gains in efficiency without sacrificing accuracy, underscoring the potential of combining SNNs and DVS technology for complex tasks in real-world environments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

Private letter to a colleague

The paper runs hybrid SNN-CNNs on synthetic events from smartphone video for fall detection and reports efficiency gains in simulation, but the missing real-DVS comparison leaves the neuromorphic claims unproven.

Colleague,

The paper converts ordinary smartphone videos into event streams and trains hybrid spiking-convolutional models for fall detection. It compares those models against standard machine learning baselines across several datasets and states that efficiency improves while accuracy holds.

There is no new core technique. Hybrid SNN-CNN architectures and frame-to-event conversion are established workarounds. What the authors do is apply the combination to this specific healthcare task and run the numbers in simulation. That produces a usable example pipeline if the conversion code or models are released.

The soft spot is the data generation step. The events come from frame differencing or similar processing of RGB video. That process omits photon noise, per-pixel threshold variation, refractory periods, and contrast-dependent latency that real Dynamic Vision Sensors exhibit. The paper therefore measures performance only on idealized event streams. Without at least one matched recording of the same scenes on actual DVS hardware, the accuracy and efficiency numbers cannot be read as evidence that the system will behave the same way on neuromorphic edge devices.
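
The idealized conversion described here can be sketched as follows. This is a minimal illustration, assuming log-intensity frame differencing with a single global contrast threshold. It is not the authors' pipeline, and the function name, the `threshold` and `eps` values, and the `(t, x, y, polarity)` event format are hypothetical.

```python
import math


def frames_to_events(frames, timestamps, threshold=0.2, eps=1e-3):
    """Convert grayscale frames to idealized DVS-style events.

    frames: list of 2D lists with intensities in [0, 1].
    timestamps: one time value per frame.
    Returns a list of (t, x, y, polarity) tuples with polarity +1 or -1.
    """
    height, width = len(frames[0]), len(frames[0][0])
    # Per-pixel reference log intensity, initialized from the first frame.
    reference = [
        [math.log(frames[0][y][x] + eps) for x in range(width)]
        for y in range(height)
    ]
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        for y in range(height):
            for x in range(width):
                diff = math.log(frame[y][x] + eps) - reference[y][x]
                count = int(abs(diff) / threshold)
                if count == 0:
                    continue
                polarity = 1 if diff > 0 else -1
                # All events share the frame timestamp: no sub-frame timing.
                events.extend((t, x, y, polarity) for _ in range(count))
                reference[y][x] += polarity * count * threshold
    return events
```

Everything a real sensor adds is absent from this sketch: the threshold is identical for every pixel, no noise events occur, a pixel can fire any number of times per frame, and event timing is quantized to the video frame rate.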

The abstract still frames the work as demonstrating potential for real-world DVS environments. That framing rests on an assumption the experiments do not test. The rest of the paper appears internally consistent and stays within simulation, so the limitation is clear rather than hidden.

The work is mainly for groups already exploring event-based methods for always-on monitoring who need a starting point that avoids buying cameras. Readers focused on deployable hardware would want the transfer question addressed first.

It is worth sending to referees. The application is concrete, the setup is reproducible in principle, and a review can require either hardware validation or a tighter claim about what the simulation actually shows. Not a desk reject.

Referee Report

2 major / 1 minor

Summary. The manuscript describes the creation of hybrid spiking neural network (SNN) models that incorporate convolutional neural network (CNN) elements for processing synthetic event-based data generated from standard smartphone videos. Focused on fall detection, the models are tested on multiple datasets and compared to conventional machine learning techniques, with claims of improved efficiency at comparable accuracy levels, pointing to the promise of neuromorphic approaches in practical scenarios.

Significance. Should the synthetic event simulation prove representative of actual Dynamic Vision Sensor (DVS) output, the method offers a pathway to develop energy-efficient fall detection systems using readily available video sources rather than specialized cameras. This could accelerate research in neuromorphic computing for healthcare applications. The hybrid model design is a sensible way to handle the sparse, asynchronous nature of event data.

major comments (2)

[Abstract] The statement regarding 'significant gains in efficiency without sacrificing accuracy' provides no supporting quantitative evidence, such as specific accuracy rates, efficiency metrics, dataset descriptions, or baseline comparisons. This omission makes it impossible to verify the central performance claims.
[Methods] Data simulation approach: The conversion of RGB video to event data does not incorporate real DVS characteristics including photon noise, latency, and contrast threshold variations. The absence of a direct comparison between synthetic and real DVS recordings on the same fall scenarios undermines the assertion of potential for real-world neuromorphic deployment.

minor comments (1)

[Abstract] The abstract would be strengthened by the inclusion of at least one key numerical result to substantiate the efficiency and accuracy claims.

Simulated Authors' Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive feedback. We address the major comments point by point below, proposing revisions where they strengthen the manuscript without misrepresenting our work.

Referee: [Abstract] The statement regarding 'significant gains in efficiency without sacrificing accuracy' provides no supporting quantitative evidence, such as specific accuracy rates, efficiency metrics, dataset descriptions, or baseline comparisons. This omission makes it impossible to verify the central performance claims.

Authors: We agree that the abstract would be strengthened by including quantitative details. In the revised version, we will update the abstract to report specific accuracy rates, efficiency metrics (such as energy or latency comparisons), dataset names, and baseline results drawn from the experiments in the results section.
Revision: yes

Referee: [Methods] Data simulation approach: The conversion of RGB video to event data does not incorporate real DVS characteristics including photon noise, latency, and contrast threshold variations. The absence of a direct comparison between synthetic and real DVS recordings on the same fall scenarios undermines the assertion of potential for real-world neuromorphic deployment.

Authors: The simulation employs a standard RGB-to-event conversion without additional modeling of photon noise, latency, or variable contrast thresholds, as this is a low-cost synthetic approach. We will revise the methods section to describe the conversion process more explicitly and add a dedicated limitations paragraph in the discussion that qualifies the real-world deployment claims. A direct side-by-side comparison with real DVS recordings on identical scenarios is not possible with the data available in this study.
Revision: partial

standing simulated objections not resolved

Direct comparison between the synthetic event data and real DVS recordings on the same fall scenarios

Circularity Check

0 steps flagged

No circularity: empirical ML evaluation with no derivations or self-referential predictions

The paper describes an empirical study training hybrid SNN-CNN models on synthetic event data generated from smartphone videos for fall detection, with performance compared to traditional ML models. No equations, first-principles derivations, fitted parameters presented as predictions, or uniqueness theorems appear in the abstract or described content. Claims rest on simulation results rather than any mathematical chain that reduces to its own inputs by construction. This is a standard supervised learning setup without the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract contains no mathematical derivations, parameters, or explicit assumptions; no free parameters, axioms, or invented entities can be identified.

pith-pipeline@v0.9.1-grok · 5641 in / 975 out tokens · 19362 ms · 2026-06-26T21:20:44.996754+00:00 · methodology

Reference graph

Works this paper leans on

12 extracted references · 1 linked inside Pith

[1] G. Gallego, T. Delbruck, G. Orchard, C. Bartolozzi, B. Taba, A. Censi, S. Leutenegger, A. J. Davison, J. Conradt, K. Daniilidis, and D. Scaramuzza, “Event-based vision: A survey,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 1, pp. 154–180, 2020.

[2] G. Cicirelli, R. Marani, A. Petitti, A. Milella, and T. D’Orazio, “Ambient assisted living: A review of technologies, methodologies and future perspectives for healthy aging of population,” Sensors, vol. 21, no. 10, p. 3549, 2021.

[3] X. Wang, J. Ellul, and G. Azzopardi, “Elderly fall detection systems: A literature survey,” Frontiers in Robotics and AI, vol. 7, p. 71, 2020.

[4] A. Singh et al., “Sensor technologies for fall detection systems: A review,” IEEE Sensors Journal, vol. 20, no. 13, pp. 6889–6919, 2020.

[5] X. Wang et al., “Fall detection with event-based data: A case study,” in International Conference on Computer Analysis of Images and Patterns. Springer Nature Switzerland, 2023.

[6] Y. Hu, S.-C. Liu, and T. Delbruck, “v2e: From video frames to realistic dvs events,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021.

[7] S. S. Prasad et al., “Hybrid snn-based privacy-preserving fall detection using neuromorphic sensors,” in Proceedings of the Fourteenth Indian Conference on Computer Vision, Graphics and Image Processing, 2023.

[8] H. Lee et al., “Embedded real-time fall detection using deep learning for elderly care,” arXiv preprint arXiv:1711.11200, 2017.

[9] K. S. Krishnan and K. S. Krishnan, “Benchmarking conventional vision models on neuromorphic fall detection and action recognition dataset,” in 2022 IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC). IEEE, 2022.

[10] A. N. Belbachir et al., “Care: A dynamic stereo vision sensor system for fall detection,” in 2012 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, 2012.

[11] G. Rojas, “Nfdd snn kaggle notebook,” 2024, [Online]. Available: https://www.kaggle.com/code/apemangr/nfdd-snn

[12] A. Amir et al., “A low power, fully event-based gesture recognition system,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.
