How can vision language models improve the accuracy of objec | ScienceToStartup