GRIP (Geometric Routing Invariance Preservation) is an algorithm-agnostic framework for machine unlearning in Mixture-of-Experts (MoE) models. It applies a geometric constraint to router gradient updates, forcing knowledge erasure directly from expert parameters rather than through superficial router manipulation.
GRIP is a new method for safely removing specific information from large AI models that use a Mixture-of-Experts (MoE) design. It works by making sure the model truly forgets the information by changing the core "expert" parts, instead of just hiding it by messing with how the model chooses which expert to use. This helps maintain the model's overall usefulness while ensuring sensitive data is properly erased.
Geometric Routing Invariance Preservation
Was this definition helpful?