Diffusion LLMs represent an emerging paradigm in large language models, distinguished by their inherent capacity for bidirectional modeling and iterative refinement. Unlike traditional autoregressive LLMs, which generate text sequentially from left to right, diffusion LLMs iteratively denoise a noisy input into a coherent output, much as image diffusion models do. This core mechanism allows them to naturally capture complex bidirectional dependencies within data, a significant advantage in tasks where elements influence one another in multiple directions, such as sequence design or optimization problems. By framing tasks like offline black-box optimization (BBO) as a denoising process, these models can generate improved candidates by conditioning on natural-language task descriptions and offline datasets. This approach is particularly valuable in fields such as DNA sequence design and robotics, where finding optimal designs often means navigating intricate, interdependent relationships within limited data.
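The iterative refinement loop described above can be sketched in miniature. The toy Python below is an illustration only, not a real diffusion LLM: `toy_denoiser` is a hypothetical stand-in for a learned model (here it simply "predicts" a known target with random confidence), and the unmasking schedule mirrors the masked/discrete diffusion idea of starting from a fully noised sequence and committing the most confident predictions at each step.

```python
import random

MASK = "<mask>"

def toy_denoiser(seq, target):
    # Hypothetical stand-in for a learned denoiser: proposes a token and
    # a confidence score for each masked position. For illustration it
    # always proposes the target token with a random confidence.
    return {i: (target[i], random.random())
            for i, tok in enumerate(seq) if tok == MASK}

def iterative_denoise(target, steps=4):
    # Start from a fully masked (maximally noisy) sequence and, over a
    # fixed number of refinement steps, commit only the most confident
    # predictions each round -- the core loop of masked-diffusion decoding.
    seq = [MASK] * len(target)
    for step in range(steps, 0, -1):
        proposals = toy_denoiser(seq, target)
        if not proposals:
            break
        # Unmask schedule: commit an even share of positions per step.
        k = max(1, len(proposals) // step)
        best = sorted(proposals.items(), key=lambda kv: -kv[1][1])[:k]
        for i, (tok, _) in best:
            seq[i] = tok
    # Fill any positions still masked after the scheduled steps.
    for i, (tok, _) in toy_denoiser(seq, target).items():
        seq[i] = tok
    return seq

print("".join(iterative_denoise(list("ACGTTGCA"))))
```

Because every position can be revised with full visibility of the rest of the sequence, this style of decoding captures the bidirectional dependencies that a strictly left-to-right generator cannot condition on.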
Diffusion LLMs are a new type of AI model that can generate complex designs by iteratively refining noisy inputs, much like how image diffusion models work. They are especially good at tasks where different parts of a design depend on each other in multiple ways, making them useful for optimizing things like DNA sequences or robot movements.
Diffusion-based LLMs, Denoising LLMs