Stream-dataflow acceleration
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.AR (2)
Years: 2026 (2)
Citing papers
- DICE: Enabling Efficient General-Purpose SIMT Execution with Statically Scheduled Coarse-Grained Reconfigurable Arrays
  DICE achieves 1.77-1.90x dynamic energy efficiency and 42-46% power reduction versus modeled NVIDIA Turing SMs by executing SIMT workloads on pipelined CGRAs with p-graphs handling dynamism and 68% fewer register file accesses.
- CODO: An Automated Compiler for Comprehensive Dataflow Optimization
  CODO automates comprehensive dataflow optimization on FPGAs, achieving 1.45x-4.52x speedups on kernels and up to 33.8x on DNN models over state-of-the-art frameworks.