MoEITS is an information-theoretic algorithm for pruning experts in MoE-LLMs that produces models with higher accuracy and greater size reduction than prior state-of-the-art methods on Mixtral 8x7B, Qwen1.5-2.7B, and DeepSeek-V2-Lite.
Jacobs, Michael I
6 Pith papers cite this work.
verdicts
UNVERDICTED 6
representative citing papers
HI-MoE introduces hierarchical scene-then-instance routing in a Mixture-of-Experts detector, yielding gains over dense DINO and flat MoE variants on COCO, especially for small objects.
ICEdit achieves state-of-the-art instructional image editing in Diffusion Transformers via in-context generation, requiring only 0.1% of prior training data and 1% trainable parameters.
A proportional weight-update rule creates implicit binary evaluation signals that propagate losslessly through hierarchical selectors while preserving algebraic market integrity and admitting unique interior equilibria.
The Transformer is recovered exactly as the forward Euler step of spherical SVFlow, with multi-head attention and MoE/FFN as approximations to its vector field.
A low-rank mixture of experts model trained on handwriting data delivers strong Alzheimer's diagnosis performance with substantially reduced parameter activation during inference.
citing papers explorer
- MoEITS: A Green AI approach for simplifying MoE-LLMs
  MoEITS is an information-theoretic algorithm for pruning experts in MoE-LLMs that produces models with higher accuracy and greater size reduction than prior state-of-the-art methods on Mixtral 8x7B, Qwen1.5-2.7B, and DeepSeek-V2-Lite.
- HI-MoE: Hierarchical Instance-Conditioned Mixture-of-Experts for Object Detection
  HI-MoE introduces hierarchical scene-then-instance routing in a Mixture-of-Experts detector, yielding gains over dense DINO and flat MoE variants on COCO, especially for small objects.
- In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer
  ICEdit achieves state-of-the-art instructional image editing in Diffusion Transformers via in-context generation, requiring only 0.1% of prior training data and 1% trainable parameters.
- Implicit Evaluation Under Minimal Information: Price Formation in Hierarchical Component Selection
  A proportional weight-update rule creates implicit binary evaluation signals that propagate losslessly through hierarchical selectors while preserving algebraic market integrity and admitting unique interior equilibria.
- Transformer as an Euler Discretization of Score-based Variational Flow
  The Transformer is recovered exactly as the forward Euler step of spherical SVFlow, with multi-head attention and MoE/FFN as approximations to its vector field.
- Efficient Handwriting-Based Alzheimer's Disease Diagnosis Using a Low-Rank Mixture of Experts Deep Learning Framework
  A low-rank mixture of experts model trained on handwriting data delivers strong Alzheimer's diagnosis performance with substantially reduced parameter activation during inference.