MG-Data-240K is a novel dataset designed for Generalized Multi-image Visual Grounding, addressing limitations in existing datasets regarding target quantity and image relation. It systematically categorizes multi-image grounding tasks to enable unified modeling for complex visual reasoning.
MG-Data-240K is a new dataset created to help advanced AI models better understand and locate objects across multiple images. It fixes problems with older datasets that couldn't handle many targets or complex relationships between images, allowing AI to perform more generalized visual tasks.
Was this definition helpful?