A structural pruning framework for MoE models that solves channel-score coverage maximization via attribution approximation, preserving accuracy at 50% or 25% pruning plus 4-bit quantization on DeepSeek and Qwen models.
Sensitivity and Robustness Analysis C.4.1
1 Pith paper cites this work.
Field: cs.LG (1). Year: 2026 (1). Verdict: unverdicted (1).
Representative citing papers:
Attribution-Guided and Coverage-Maximized Pruning for Structural MoE Compression