DAG-MoE uses a lightweight module to learn a DAG-based structural aggregation of the selected experts, expanding the combination space and enabling intra-layer multi-step reasoning, in contrast to the weighted sum used by standard MoE.
arXiv preprint arXiv:2501.15103
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Representative citing papers
DAG-MoE: From Simple Mixture to Structural Aggregation in Mixture-of-Experts