From Images to Words: Efficient Cross-Modal Knowledge Distillation to Language Models from Black-box Teachers | ScienceToStartup