Dialogue Model Optimization via Agent Game and Adaptive Tree-based GRPO | ScienceToStartup