MAR: Efficient Large Language Models via Module-aware Architecture Refinement | ScienceToStartup