How does early vision-language fusion enhance generative models for dataset distillation?