Fields: cs.AR
Years: 2026
Verdicts: UNVERDICTED (1)
1 Pith paper cites this work; polarity classification is still indexing.
Representative citing paper (1): XpulpNN: Enabling Energy Efficient and Flexible Inference of Quantized Neural Networks on RISC-V Based IoT End Nodes
Co-Design of CNN Accelerators for TinyML using Approximate Matrix Decomposition
A co-design framework combining approximate matrix decomposition with genetic-algorithm search delivers a 33% average latency reduction in TinyML CNN FPGA accelerators, at a 1.3% average accuracy loss, compared with standard systolic arrays.