What are the limitations of current generative vision models in understanding complex scenes?