Talk, Evaluate, Diagnose: User-aware Agent Evaluation with Automated Error Analysis | ScienceToStartup