How can LLMs be optimized for low-latency inference in edge | ScienceToStartup