Logits space hinge loss is a loss function designed for quantization-aware machine unlearning. On the examples to be forgotten, it penalizes the unlearned model unless its output logits differ from the original model's logits by at least a set margin, with the aim that the removed knowledge is not restored when the model is quantized to low bit widths.
Logits space hinge loss is a machine unlearning technique that keeps forgotten information from reappearing when AI models are compressed for deployment. It works by making the model's outputs for 'forgotten' data clearly different from its original outputs, even after quantization stores the model's internal numbers at lower precision.
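The definition above can be made concrete with a minimal sketch. The exact formula is not given here, so the form below is an assumption: the gap between the two models' logits is measured with the Euclidean distance, and each forgotten example contributes `max(0, margin - distance)`. The function name `logit_hinge_loss` and its parameters are hypothetical, chosen only for illustration.

```python
import math


def logit_hinge_loss(unlearned_logits, original_logits, margin=1.0):
    """Mean hinge penalty over a batch of forgotten examples.

    Assumed form (not confirmed by the source): for each example the
    penalty is max(0, margin - ||z_unlearned - z_original||_2), so it
    drops to zero once the logits are at least `margin` apart.
    """
    if len(unlearned_logits) != len(original_logits):
        raise ValueError("both models must score the same examples")
    if not unlearned_logits:
        raise ValueError("batch must contain at least one example")

    penalties = []
    for z_new, z_old in zip(unlearned_logits, original_logits):
        # Euclidean distance between the two logit vectors.
        distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(z_new, z_old)))
        # Hinge: penalize only while the gap is smaller than the margin.
        penalties.append(max(0.0, margin - distance))
    return sum(penalties) / len(penalties)


# Logits of the original model on two forgotten examples.
original = [[2.0, 0.0], [2.0, 0.0]]
# First example is unchanged (full penalty); second has moved past the margin.
unlearned = [[2.0, 0.0], [2.0, 3.0]]
print(logit_hinge_loss(unlearned, original, margin=1.0))
```

In this sketch an example stops contributing to the loss once its logits are at least `margin` away from the original ones. One plausible reading, offered as intuition rather than as the source's claim, is that a margin larger than the logit shift caused by quantization leaves the forgotten outputs different after compression.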
Logits Space Hinge Loss