Attention-aware Inference Optimizations for Large Vision-Language Models with Memory-efficient Decoding | ScienceToStartup