What is the impact of depth-recurrent transformers on LLM generalization capabilities?

Answer not yet generated.