What are the limitations of current LLM evaluation methods when assessing long-tail knowledge acquisition?

Answer not yet generated.