XMoE: Sparse models with fine-grained and adaptive expert selection

2024 · arXiv 2403.18926

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Analytical FFN-to-MoE Restructuring via Activation Pattern Analysis

cs.LG · 2025-02-06 · unverdicted · novelty 6.0

An analytical post-training method restructures FFNs into MoE by partitioning neurons based on activation patterns and building a router from statistics, achieving 1.17x speedup with minimal resources.

Beyond Uniform Experts: Cost-Aware Expert Execution for Efficient Multi-Device MoE Inference

cs.DC · 2026-06-29 · unverdicted · novelty 5.0

CAEE reduces MoE inference latency 8-18% on 671B DeepSeek-R1 by cost-aware expert pruning and low-overhead compensation while keeping accuracy drop under 1%.

citing papers explorer

Showing 2 of 2 citing papers.

Analytical FFN-to-MoE Restructuring via Activation Pattern Analysis · cs.LG · 2025-02-06 · unverdicted · none · ref 17
Beyond Uniform Experts: Cost-Aware Expert Execution for Efficient Multi-Device MoE Inference · cs.DC · 2026-06-29 · unverdicted · none · ref 38
