Do Vision Language Models Understand Human Engagement in Games? | ScienceToStartup