Causally Robust Reward Learning from Reason-Augmented Preference Feedback | ScienceToStartup