MoE routing is the mechanism in Mixture-of-Experts (MoE) models that dynamically selects a sparse subset of specialized expert networks to process each input token. This allows models to achieve vast parameter counts and high capacity while maintaining efficient inference by only activating a fraction of the total parameters per computation.
MoE routing is a smart way for large AI models to work efficiently. Instead of using all parts of the model for every task, it picks only a few specialized parts, saving a lot of computing power and energy. This allows AI models to become much bigger and more capable without becoming too expensive to run.
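The routing described above can be sketched in a few lines of NumPy. This is a minimal, illustrative example, not any particular model's implementation: a learned gate scores all experts per token, softmax turns the scores into probabilities, and only the top-k experts (here linear layers, with hypothetical names like `W_gate`) are actually run, their outputs combined with renormalized weights.

```python
# Minimal top-k MoE routing sketch (illustrative; all names are hypothetical).
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 4, 2

# Each "expert" is a simple linear layer in this sketch.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
W_gate = rng.standard_normal((d_model, n_experts))  # router (gate) weights

def moe_layer(x):
    # Router: score every expert for this token, keep only the top-k.
    logits = x @ W_gate                            # shape: (n_experts,)
    probs = softmax(logits)
    chosen = np.argsort(probs)[-top_k:]            # indices of top-k experts
    weights = probs[chosen] / probs[chosen].sum()  # renormalize over top-k
    # Sparse activation: only the chosen experts execute.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, chosen))

token = rng.standard_normal(d_model)
out = moe_layer(token)
print(out.shape)  # (8,)
```

Note that of the four expert matrices, only two participate in the matrix multiplications for this token; the other two contribute no compute, which is the source of the efficiency gain.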
Related terms: MoE gate, router network, expert routing, sparse activation