How do depth-recurrent transformers differ from standard recurrent neural networks in their application to LLMs?

Answer not yet generated.