Beyond Test-Time Training: Learning to Reason via Hardware-Efficient Optimal Control | ScienceToStartup