Mixture-of-Experts (MoE) is a neural network architecture that employs multiple 'expert' sub-networks and a 'router' to dynamically select which experts process different parts of the input. This enables models to scale to billions of parameters while only activating a small subset per input, improving efficiency and specialization.
In simpler terms, MoE is a type of AI model that uses multiple specialized sub-networks, called "experts," and a "router" that decides which experts handle each part of the input. Because only a small fraction of the model runs for any given input, models can grow much larger and more capable without becoming proportionally slower or more expensive to run.
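The routing idea can be illustrated with a minimal sketch: a learned router scores every expert for each token, keeps only the top-k experts, renormalizes their scores, and combines their outputs. This is an illustrative toy in NumPy, not any particular framework's implementation; the class names (`Expert`, `MoELayer`) and all dimensions are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class Expert:
    """A tiny feed-forward 'expert': here, just one linear layer."""
    def __init__(self, d_model):
        self.w = rng.standard_normal((d_model, d_model)) * 0.02

    def __call__(self, x):
        return x @ self.w

class MoELayer:
    """Sparse MoE layer: a router picks the top-k experts per token."""
    def __init__(self, d_model, n_experts=4, k=2):
        self.experts = [Expert(d_model) for _ in range(n_experts)]
        self.router_w = rng.standard_normal((d_model, n_experts)) * 0.02
        self.k = k

    def __call__(self, x):
        # Router produces one score per expert for each token.
        gates = softmax(x @ self.router_w)          # shape: (tokens, n_experts)
        out = np.zeros_like(x)
        for t in range(x.shape[0]):
            topk = np.argsort(gates[t])[-self.k:]   # indices of the top-k experts
            weights = gates[t, topk]
            weights = weights / weights.sum()       # renormalize over chosen experts
            # Only the selected experts run for this token (conditional computation);
            # the other experts' parameters are untouched.
            for w, e in zip(weights, topk):
                out[t] += w * self.experts[e](x[t])
        return out

layer = MoELayer(d_model=8, n_experts=4, k=2)
tokens = rng.standard_normal((3, 8))
y = layer(tokens)
print(y.shape)  # each token is processed by only 2 of the 4 experts
```

With 4 experts and k=2, each token touches only half the experts' parameters per forward pass; scaling to many more experts grows total capacity while per-token compute stays roughly constant.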
MoE, Sparse MoE, Mixture of Heterogeneous Experts (MoHE), Gating Network, Conditional Computation