How can vision-language models be leveraged for explainable | ScienceToStartup