STAPO: Stabilizing Reinforcement Learning for LLMs by Silencing Rare Spurious Tokens | ScienceToStartup