LaViT: Aligning Latent Visual Thoughts for Multi-modal Reasoning | ScienceToStartup