Profiling shows persistent expert load imbalance and domain-specific activation patterns in large MoE models; workload-aware grouping and placement reduce all-to-all communication volume by up to 20x.
2024.3399654
4 Pith papers cite this work. Polarity classification is still indexing.
verdicts: UNVERDICTED 4
roles: background 1
polarities: background 1
citing papers explorer
- Scaling Multi-Node Mixture-of-Experts Inference Using Expert Activation Patterns
  Profiling shows persistent expert load imbalance and domain-specific activation patterns in large MoE models; workload-aware grouping and placement reduce all-to-all communication volume by up to 20x.
- Causal Software Engineering: A Vision and Roadmap
  Causal Software Engineering is proposed as a paradigm that applies causal models and counterfactual reasoning to inform high-stakes decisions throughout software development and operations.
- Networking-Aware Energy Efficiency in Agentic AI Inference: A Survey
  The paper surveys energy efficiency strategies for Agentic AI inference by proposing a new accounting framework and taxonomy that spans model simplification, computation control, input optimization, and cross-layer co-design with wireless networks.
- Genetic-based fog colony optimization hybridized with hierarchical clustering and its influence in the placement of fog services
  Hierarchical clustering generates fog colony candidates from device data; NSGA-II selects subsets optimizing network latency and placement runtime across nine scenarios with up to 137 generations needed to dominate controls.