SD 1.5 BoxDiff is a variant of the Stable Diffusion 1.5 text-to-image model, primarily used as a baseline to evaluate how well generative models follow explicit spatial instructions. It is benchmarked for its ability to render objects according to specified pairwise spatial relations.
SD 1.5 BoxDiff is a version of the Stable Diffusion 1.5 model used to test how well image generators place objects according to text descriptions. It helps researchers check whether a model can accurately follow instructions like 'put the cat to the left of the dog.'
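As a rough illustration of what checking a pairwise spatial relation can look like, the sketch below compares the centers of two detected bounding boxes. It is not the scoring method of any particular benchmark: the box format, the `relation_holds` helper, the center-comparison rule, and the example coordinates are all assumptions made for this example.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels,
# with the origin at the top-left corner and y growing downward.
Box = Tuple[float, float, float, float]


def center(box: Box) -> Tuple[float, float]:
    """Return the (x, y) center of a bounding box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def relation_holds(box_a: Box, box_b: Box, relation: str) -> bool:
    """Check whether object A stands in `relation` to object B.

    Hypothetical rule: compare box centers only. A real evaluation
    might also require a minimum gap or limit box overlap.
    """
    ax, ay = center(box_a)
    bx, by = center(box_b)
    if relation == "left of":
        return ax < bx
    if relation == "right of":
        return ax > bx
    if relation == "above":
        return ay < by  # smaller y is higher in the image
    if relation == "below":
        return ay > by
    raise ValueError(f"unsupported relation: {relation!r}")


# Made-up detections for the prompt "a cat to the left of a dog".
cat = (40.0, 200.0, 180.0, 330.0)
dog = (300.0, 190.0, 470.0, 340.0)
print(relation_holds(cat, dog, "left of"))
```

In practice the boxes would come from an object detector run on the generated image, and the result would be averaged over many prompts to score a model.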
BoxDiff