How can LLMs be optimized for low-latency inference in edge computing environments?

Low-latency LLM inference on edge hardware is usually achieved by combining several complementary techniques rather than relying on any single one:

- Quantization: storing weights (and often activations) in int8 or int4 instead of fp16/fp32 shrinks the model's memory footprint and bandwidth requirements, which matters because autoregressive decoding is typically memory-bound on edge devices.
- Knowledge distillation: a smaller "student" model is trained to imitate a larger "teacher", trading some accuracy for a model that fits the device's compute and memory budget.
- Pruning and structured sparsity: removing low-importance weights, heads, or layers reduces compute per token; structured pruning is generally preferred on edge hardware because unstructured sparsity is hard to exploit without specialized kernels.
- KV-cache management: caching the attention keys and values of already-processed tokens avoids recomputing them at every decoding step, so per-token latency stays roughly constant instead of growing with sequence length.
- Speculative decoding: a small draft model proposes several tokens cheaply, and the larger model verifies them in a single forward pass, reducing the number of expensive sequential steps.
- Optimized runtimes and kernel fusion: engines such as llama.cpp, ONNX Runtime, and TensorRT-LLM fuse operations, use device-specific kernels, and eliminate framework overhead.
- Hardware acceleration: offloading matrix multiplications to the device's NPU, GPU, or DSP where available.

In practice, an edge deployment typically starts with a small distilled model, applies 4- or 8-bit quantization, and runs it through a compiled runtime, with speculative decoding added when a suitable draft model exists.
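To make the quantization idea concrete, here is a minimal, dependency-free sketch of symmetric per-tensor int8 quantization. It is illustrative only: real deployments use per-channel or group-wise schemes and library kernels (e.g. PyTorch's `quantize_dynamic` or llama.cpp's quant formats), and the function names below are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats onto [-127, 127].

    Returns the int8 values plus the scale needed to reconstruct
    approximate float weights (illustrative sketch, not a library API).
    """
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [x * scale for x in q]


weights = [0.42, -1.27, 0.08, 0.90]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# Each int8 value occupies 1 byte instead of 4 (fp32), a 4x memory saving;
# the round-trip error here is negligible because the scale divides evenly.
```

The key trade-off this sketch exposes is that a single large outlier weight inflates `scale` and coarsens the grid for every other weight, which is why production schemes quantize per channel or per group of weights.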