A pruning-quantization-Huffman pipeline compresses deep neural networks 35-49x without accuracy loss.
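The three-stage pipeline named in the summary can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes magnitude pruning, 1-D k-means weight sharing, and Huffman coding of the cluster indices, and it omits the sparse-index storage a real deployment would also need. All function names are hypothetical.

```python
import heapq
from collections import Counter

import numpy as np

def prune(weights, threshold):
    """Magnitude pruning: zero out weights whose |value| is below threshold."""
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

def kmeans_quantize(values, k, iters=10):
    """Cluster surviving weights into k shared values (simple 1-D k-means)."""
    centroids = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        idx = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        for j in range(k):
            if np.any(idx == j):
                centroids[j] = values[idx == j].mean()
    return idx, centroids

def huffman_bits(symbols):
    """Total bits needed to Huffman-code the symbol stream.

    Uses the standard identity: total code length equals the sum of the
    merged node counts produced while building the Huffman tree."""
    counts = list(Counter(symbols).values())
    if len(counts) == 1:
        return len(symbols)  # degenerate alphabet: 1 bit per symbol
    heapq.heapify(counts)
    total = 0
    while len(counts) > 1:
        a = heapq.heappop(counts)
        b = heapq.heappop(counts)
        total += a + b
        heapq.heappush(counts, a + b)
    return total

# Toy demo on random "weights" (not a real network layer).
rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)
pruned, mask = prune(w, threshold=0.5)
surviving = pruned[mask]
idx, centroids = kmeans_quantize(surviving, k=16)

dense_bits = w.size * 32  # original fp32 storage
# Huffman-coded cluster indices plus the fp32 codebook (index positions omitted).
compressed_bits = huffman_bits(idx.tolist()) + centroids.size * 32
print(f"compression ratio (toy): {dense_bits / compressed_bits:.1f}x")
```

The toy ratio is far from the reported 35-49x because real layers are much larger, sparser after pruning, and the paper additionally compresses the sparse index structure.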
To reduce variance, we measured the time spent on each layer over 4096 input samples and averaged the per-sample time.
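That measurement protocol, averaging one layer's wall-clock time over many inputs, can be sketched as follows; `avg_layer_time`, the dummy layer, and the inputs are hypothetical stand-ins for a real layer and dataset:

```python
import time

def avg_layer_time(layer_fn, inputs):
    """Time layer_fn on every input and return the mean wall-clock time
    per sample, which smooths out per-run timing variance."""
    start = time.perf_counter()
    for x in inputs:
        layer_fn(x)
    return (time.perf_counter() - start) / len(inputs)

# Hypothetical stand-in for one network layer: a cheap elementwise op.
dummy_layer = lambda x: [v * v for v in x]
samples = [[float(i)] * 8 for i in range(4096)]  # 4096 dummy input samples
per_sample = avg_layer_time(dummy_layer, samples)
print(f"avg time per sample: {per_sample * 1e6:.2f} us")
```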
1 Pith paper cites this work (polarity classification is still indexing).
Fields: cs.CV · Year: 2015 · Verdict: CONDITIONAL · Representative citing papers: 1
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding