How Far Are Vision-Language Models from Constructing the Real World? A Benchmark for Physical Generative Reasoning | ScienceToStartup