Translation or Recitation? Calibrating Evaluation Scores for Machine Translation of Extremely Low-Resource Languages | ScienceToStartup