FuxiTranyu: A Multilingual Large Language Model Trained with Balanced Data
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
TENP: Trapezoidal Expert Neuron Pruning For Mixture-of-Experts
TENP applies trapezoidal expert-neuron pruning to MoE models, retaining key experts while pruning others via projected neuron contribution, yielding only a 1-point accuracy drop at 40% sparsity on DeepSeek, with a 10% code-generation gain.