Dual-Space Knowledge Distillation with Key-Query Matching for Large Language Models with Vocabulary Mismatch | Buildability Receipt | ScienceToStartup