Accelerating sparse deep neural networks. arXiv preprint arXiv:2104.08378

Asit Mishra, Jorge Albericio Latorre, Jeff Pool, Darko Stosic, Dusan Stosic, Ganesh Venkatesh, Chong Yu, Paulius Micikevicius · 2021 · arXiv 2104.08378

5 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

SparseForge: Efficient Semi-Structured LLM Sparsification via Annealing of Hessian-Guided Soft-Mask

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

SparseForge achieves 57.27% zero-shot accuracy on LLaMA-2-7B at 2:4 sparsity using only 5B retraining tokens, beating the dense baseline and nearly matching a 40B-token SOTA method.

SLaB: Sparse-Lowrank-Binary Decomposition for Efficient Large Language Models

cs.LG · 2026-04-06 · unverdicted · novelty 6.0

SLaB compresses LLM weights via sparse-lowrank-binary decomposition guided by activation-aware scores, achieving up to 36% lower perplexity than prior methods at 50% compression on Llama models.

ELAS: Efficient Pre-Training of Low-Rank Large Language Models via 2:4 Activation Sparsity

cs.LG · 2026-05-05 · unverdicted · novelty 5.0

ELAS pre-trains low-rank LLMs by applying 2:4 activation sparsity after squared ReLU to cut memory and accelerate training with minimal performance loss.

Adaptive Norm-Based Regularization for Neural Networks

stat.ML · 2026-04-30 · unverdicted · novelty 5.0

Covariance-aware ridge and combined l1-l2 regularizers for neural networks yield better predictive performance and complexity control than standard penalties in simulations and applications to cooling-load prediction and leukemia classification.

HieraSparse: Hierarchical Semi-Structured Sparse KV Attention

cs.DC · 2026-04-18 · unverdicted · novelty 5.0

HieraSparse delivers a hierarchical semi-structured sparse KV attention system that achieves 1.2x KV compression and 4.57x decode attention speedup versus prior unstructured sparsity methods at equivalent sparsity, plus up to 1.85x prefill speedup and 1.37x/1.77x speedups with magnitude pruning and

citing papers explorer

SparseForge: Efficient Semi-Structured LLM Sparsification via Annealing of Hessian-Guided Soft-Mask · cs.LG · 2026-05-07 · unverdicted · none · ref 24
SLaB: Sparse-Lowrank-Binary Decomposition for Efficient Large Language Models · cs.LG · 2026-04-06 · unverdicted · none · ref 21
ELAS: Efficient Pre-Training of Low-Rank Large Language Models via 2:4 Activation Sparsity · cs.LG · 2026-05-05 · unverdicted · none · ref 6
Adaptive Norm-Based Regularization for Neural Networks · stat.ML · 2026-04-30 · unverdicted · none · ref 25
HieraSparse: Hierarchical Semi-Structured Sparse KV Attention · cs.DC · 2026-04-18 · unverdicted · none · ref 40
