LFPO: Likelihood-Free Policy Optimization for Masked Diffusion Models | ScienceToStartup