Stable Adaptive Thinking via Advantage Shaping and Length-Aware Gradient Regulation | ScienceToStartup