Knowledge Distillation is a model compression and transfer learning technique in which a compact 'student' model is trained to replicate the outputs (or intermediate representations) of a larger, pre-trained 'teacher' model. The goal is to approach the teacher's performance while enabling deployment in environments with limited computational power or memory.
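As a concrete illustration, here is a minimal NumPy sketch of the soft-target distillation loss popularized by Hinton et al. (2015): the KL divergence between temperature-softened teacher and student output distributions, scaled by T². In practice this term is combined with the standard cross-entropy on hard labels; the function names and weighting here are illustrative, not from the source.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T yields softer distributions."""
    z = np.asarray(logits, dtype=float) / T
    e = np.exp(z - z.max())  # shift for numerical stability
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence KL(teacher || student) on softened outputs,
    scaled by T**2 so gradients stay comparable across temperatures."""
    p = softmax(teacher_logits, T)  # soft targets from the teacher
    q = softmax(student_logits, T)  # student's softened predictions
    return float(T**2 * np.sum(p * (np.log(p) - np.log(q))))
```

The loss is zero when the student exactly reproduces the teacher's logits and grows as the two distributions diverge, which is what drives the student to mimic the teacher during training.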
| Alternative | Difference from Knowledge Distillation | Papers mentioning both | Avg. viability |
|---|---|---|---|
| Federated Learning | — | 1 | — |