Evaluation format, not model capability, drives triage failure in the assessment of consumer health AI | ScienceToStartup