How can researchers develop more robust evaluation metrics for LLM performance on rare entities?

Reviewed by ScienceToStartup Editorial
Updated 3/27/2026

Answer not yet generated.