GIST: Multimodal Knowledge Extraction and Spatial Grounding via Intelligent Semantic Topology | ScienceToStartup